VAST Datastore
The VAST Datastore is the foundational layer and core storage component of the VAST Data Platform.
It is a universal data store designed to simplify data management, provide open access to data, and give customers ownership of their data across various deployment scenarios.
Design Principles
Simplify everything: Focus on ease of deployment, management, maintenance, and scalability.
Open access: Support standard APIs and protocols to enable access by a wide range of applications.
Customer ownership: Provide flexibility in deployment across data centres, cloud, and edge environments.
Architecture
Disaggregated and Shared Everything (DASE) architecture: Separates the state layer (data) from the logic layer (compute), allowing for scalable processing and capacity without traditional trade-offs.
Element Store: A flexible, scalable, and efficient way to store data on hardware. It is a byte-addressable store that lays out metadata on the lowest-latency flash and data on the lowest-cost flash.
Element: The core data unit, 32KB in size, which abstracts a file, object, or table.
Interfaces: NFS, S3, SMB, and SQL interfaces give native access to data in the Element Store without protocol stacking.
Element Store
The Element Store is the foundation of the VAST Datastore. It was built to store and manage very large amounts of data efficiently while providing high-performance access to that data.
Separation of metadata and data
By laying out metadata on low-latency flash and data on low-cost flash, the Element Store optimises performance and cost.
Metadata, which is accessed more frequently and requires faster response times, is stored on high-performance, low-latency flash. This ensures that file and object lookups, directory traversals, and other metadata-heavy operations are executed quickly.
On the other hand, the actual data, which is typically larger in size and accessed less frequently, is stored on lower-cost, higher-capacity flash. This tiered approach allows VAST to strike a balance between performance and cost-efficiency.
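The placement rule described above can be pictured with a minimal sketch. The `Tier` enum, the tier names, and the `place` function are hypothetical illustrations of the idea, not VAST's implementation or API.

```python
from enum import Enum


class Tier(Enum):
    # Hypothetical tier labels, named after the description in the text.
    LOW_LATENCY_FLASH = "low-latency flash"
    LOW_COST_FLASH = "low-cost flash"


def place(kind: str) -> Tier:
    """Return the tier a write lands on, depending on whether it is metadata or data."""
    if kind == "metadata":
        return Tier.LOW_LATENCY_FLASH
    if kind == "data":
        return Tier.LOW_COST_FLASH
    raise ValueError(f"unknown kind: {kind!r}")


if __name__ == "__main__":
    for kind in ("metadata", "data"):
        print(f"{kind} -> {place(kind).value}")
```

The point of the sketch is that placement follows the kind of content being written, so metadata-heavy operations are served from the faster media.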
Byte-addressable access
The Element Store provides byte-addressable access to data, meaning that it can read and write data at the granularity of individual bytes.
This is in contrast to traditional block-based storage systems, which operate at the granularity of fixed-size blocks (e.g., 4KB or 8KB). Byte-addressability enables the Element Store to efficiently handle a wide range of data types and access patterns, including small, random I/O operations common in database workloads and large, sequential I/O operations typical of analytics and AI/ML workloads.
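The difference in granularity can be made concrete with a simplified model. The sketch below counts how many bytes must be transferred to serve a small read under each scheme. The function names are illustrative, and the model deliberately ignores device-internal page sizes, caching, and protocol overhead.

```python
def block_bytes_transferred(offset: int, length: int, block_size: int) -> int:
    """Bytes a block-granular store must read to serve `length` bytes at `offset`."""
    if length <= 0:
        return 0
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    # Every block touched by the range must be read in full.
    return (last_block - first_block + 1) * block_size


def byte_addressable_bytes_transferred(length: int) -> int:
    """Bytes a byte-granular store must read: only what was asked for."""
    return max(length, 0)


if __name__ == "__main__":
    offset, length, block_size = 4000, 200, 4096
    blocks = block_bytes_transferred(offset, length, block_size)
    exact = byte_addressable_bytes_transferred(length)
    print(f"block-based: {blocks} bytes, byte-addressable: {exact} bytes")
```

In this example a 200-byte read that straddles a 4KB block boundary touches two blocks, so the block-granular model transfers 8192 bytes where the byte-granular model transfers 200.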
Unified storage for diverse data types
The Element Store is designed to store and manage any type of data element, including files, objects, tables, and block devices.
This unified approach eliminates the need for separate storage silos for different data types and enables applications to access data through various protocols and APIs (e.g., NFS, S3, SMB, SQL). By providing a single, universal storage platform, VAST simplifies data management and reduces the complexity of data pipelines.
Use cases
AI and machine learning
The Element Store's ability to handle large, sequential I/O and provide fast access to vast amounts of data makes it well-suited for AI and machine learning workloads. Data scientists and engineers can store and process massive datasets, such as images, videos, and sensor data, without the need for complex data movement or transformation.
Analytics and business intelligence
The Element Store's support for SQL and its ability to efficiently handle small, random I/O make it well-suited for analytics and business intelligence workloads. Organisations can store structured and semi-structured data, such as logs, events, and time-series data, and perform real-time analytics and ad-hoc queries on that data.
Backup and archive
The Element Store's cost-efficiency and scalability make it an attractive option for backup and archival use cases. Organisations can store large volumes of backup data, including files, objects, and databases, on low-cost flash, while still maintaining fast access to that data for restore operations.
Cloud-native applications
The Element Store's support for object storage (S3) and its ability to scale seamlessly make it well-suited for cloud-native applications. Developers can build and deploy applications that leverage object storage without the need for complex data management or migration processes.
Other Features
Data Durability and Resilience
No concept of cache or unstable writes: All data written to the Element Store is stored in a stable manner.
Built-in data protection features: Includes encryption standards, external key management, and security features to support private and public cloud deployments.
Cost Efficiency
Uses the lowest-cost flash available and extends its lifespan through software optimisations.
Achieves high flash utilisation (up to 97%) through custom erasure coding and RAID schemes.
Employs similarity-based data reduction, a new technique that combines the benefits of compression and deduplication, resulting in a 4:1 data reduction ratio across the customer fleet.
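The two figures above combine multiplicatively, which a short worked calculation can show. The stripe geometry (97 data strips plus 3 parity strips) and the 1000 TB raw capacity are assumptions chosen purely for illustration. The source states only the "up to 97%" utilisation and the 4:1 fleet-wide reduction ratio, and the calculation ignores spares and other overheads.

```python
def usable_fraction(data_strips: int, parity_strips: int) -> float:
    """Fraction of raw capacity left for data after erasure-coding parity."""
    return data_strips / (data_strips + parity_strips)


def effective_capacity_tb(
    raw_tb: float, data_strips: int, parity_strips: int, reduction_ratio: float
) -> float:
    """Logical capacity after parity overhead and data reduction."""
    return raw_tb * usable_fraction(data_strips, parity_strips) * reduction_ratio


if __name__ == "__main__":
    # Hypothetical geometry: 97 data + 3 parity strips gives 97% utilisation.
    fraction = usable_fraction(97, 3)
    # Hypothetical 1000 TB raw system with the 4:1 ratio quoted in the text.
    effective = effective_capacity_tb(1000, 97, 3, 4.0)
    print(f"usable fraction: {fraction:.1%}")
    print(f"effective capacity: {effective:.1f} TB")
```

Under these assumptions, 1000 TB of raw flash would hold roughly 3880 TB of logical data. Actual results depend on how reducible a given dataset is.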
Performance
Provides the performance of a parallel file system with the simplicity of a network-attached storage platform.
Extends storage protocols (NFS, SMB) with RDMA and GPUDirect Storage to reduce data movement and accelerate job run times.
Enables stateless compute nodes to access all media and the global namespace in parallel, eliminating bottlenecks and ensuring high performance at scale.
Scalability
Designed for exabyte-scale deployments, surpassing the limitations of traditional petabyte-scale systems.
Eliminates the need for complex lock management, cache management, and east-west traffic, ensuring seamless scalability without performance degradation.
VAST Data Catalogue
Leverages the VAST Database to automatically index all metadata attributes related to data stored on the platform.
Enables users to gain insights and perform analytics on the stored data without maintaining separate metadata systems.
Supports user-defined metadata and allows for querying and searching metadata attributes at scale using simple SQL queries.
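To give a feel for querying metadata with SQL, the sketch below uses Python's built-in `sqlite3` module as a stand-in. The `catalogue` table, its columns, the user-defined `project` attribute, and the sample rows are all hypothetical. The real VAST Catalogue schema and its client interface are not described in this document.

```python
import sqlite3


def build_demo_catalogue() -> sqlite3.Connection:
    """Create an in-memory table that mimics a metadata catalogue (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalogue ("
        "path TEXT, size_bytes INTEGER, owner TEXT, project TEXT)"
    )
    conn.executemany(
        "INSERT INTO catalogue VALUES (?, ?, ?, ?)",
        [
            # Made-up sample rows; `project` plays the role of user-defined metadata.
            ("/datasets/images/cat001.jpg", 2_400_000, "alice", "vision"),
            ("/datasets/images/cat002.jpg", 1_900_000, "alice", "vision"),
            ("/logs/app/2024-01-01.log", 350_000, "bob", "ops"),
        ],
    )
    return conn


def total_size_by_project(conn: sqlite3.Connection) -> list:
    """Aggregate file count and total size per user-defined project tag."""
    return conn.execute(
        "SELECT project, COUNT(*), SUM(size_bytes) "
        "FROM catalogue GROUP BY project ORDER BY project"
    ).fetchall()


if __name__ == "__main__":
    for project, count, total in total_size_by_project(build_demo_catalogue()):
        print(f"{project}: {count} files, {total} bytes")
```

The same kind of `GROUP BY` query is what makes a catalogue useful for capacity reporting or for finding data by a custom attribute without maintaining a separate metadata system.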
The VAST Datastore is designed to be a universal data store that can handle a wide range of workloads, including AI, machine learning, deep learning, HPC, and traditional transactional applications.
It eliminates the need for tiered storage architectures and enables customers to store and access all their data in a single, high-performance, and cost-effective platform.
By providing a unified data store with open interfaces and global scalability, the VAST Datastore serves as the foundation for the VAST Data Platform, enabling organisations to unlock insights from their data and support the demanding requirements of modern applications.