Exascale Log-structured Merge-trees at VAST Data

Moshe Gabel - VAST Data

Feb. 6, 2026, 2:30 p.m. - Feb. 6, 2026, 3:30 p.m.

ENGMD 279

Hosted by: Oana Balmau


The VAST Data Platform (AI OS) is a unified platform for storing, querying, processing, streaming, enriching and indexing all types of structured and unstructured data. It is built on a novel disaggregated architecture (DASE), allowing it to scale to exabytes of capacity while maintaining exceptional performance, reliability and ease of operation. The VAST Data Platform comprises a multi-protocol all-Flash storage solution (VAST DataStore), a transactional, analytical and vector database coupled with an event streaming broker (VAST DataBase) and a pipeline orchestration infrastructure (VAST DataEngine). VAST AI OS now powers some of the world's most demanding workloads: AI training, financial forecasting, high performance scientific computing (including Canada Compute), academic research, and more.

This three-part talk will focus on the design and challenges of building the VAST DataBase: an LSM-based database with multi-version concurrency control that supports both analytical and vector workloads. Optimized for multi-PB tables with trillions of rows and vectors, the VAST DataBase provides high performance for insert, query and delete operations, with guaranteed transactional consistency, durability, isolation and availability.

We will first briefly present the DASE architecture and the different platform components, and how they work together to address the needs of modern workloads. Next, we will focus on how LSMs are used in the VAST DataBase, with emphasis on the challenges caused by our scale and by the distributed DASE architecture, which render many common approaches infeasible. Finally, we will discuss some of the challenges of supporting multiple consistent projections in different orders: building and maintaining a forest of mutually consistent LSM trees, each sorted by different keys, while maintaining strong performance.

Dr. Moshe Gabel joined VAST Data from academia in 2025 as a senior software engineer. Before that, he was a professor of computer science at the University of Toronto and at York University.

VAST Data was founded in 2016 and is one of the fastest growing infrastructure companies in history. VAST Data is enabling the AI revolution, with industry giants such as CoreWeave and xAI among its customers. The company has more than 1,200 employees globally and a growing R&D center in the Toronto area. The Toronto branch operates as a startup-like organization that develops the core of the database product and engages in related research.