Introduction to NVIDIA GPUDirect Storage (GDS)

NVIDIA GPUDirect Storage (GDS) is a technology that enables direct data transfer between GPU memory and storage devices, bypassing the traditional route through the CPU and system memory.

This approach offers three main benefits: higher bandwidth, lower latency, and lower CPU utilization.

In a traditional setup, a read moves data from storage into system memory and then from system memory into GPU memory; a write follows the same path in reverse.

This multi-step process stages every byte in a CPU-managed bounce buffer in system memory, which can become a bottleneck, especially in data-intensive applications. GDS avoids this bottleneck by establishing a direct data path between the GPU and storage, so each transfer needs one copy instead of two.
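To make the difference between the two paths concrete, here is a toy model in Python. It assumes a simple store-and-forward path in which each hop finishes before the next one starts; real systems can overlap the staged copies, so this is closer to a worst case for the traditional route. The `Hop` class, the `transfer_seconds` helper, and the bandwidth figures are all hypothetical and chosen for illustration only. They are not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    src: str
    dst: str
    gb_per_s: float  # assumed sustained bandwidth of this hop


def transfer_seconds(size_gb, hops):
    """Time for a path where each hop completes before the next begins."""
    return sum(size_gb / hop.gb_per_s for hop in hops)


# Hypothetical bandwidths, not measured values.
staged = [Hop("nvme", "host_ram", 6.0), Hop("host_ram", "gpu", 12.0)]
direct = [Hop("nvme", "gpu", 6.0)]

if __name__ == "__main__":
    size_gb = 24.0
    print("staged:", transfer_seconds(size_gb, staged), "s")
    print("direct:", transfer_seconds(size_gb, direct), "s")
```

Under these assumed numbers the staged path pays for the extra hop through system memory on every transfer, which is the cost the direct path removes.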

Key features and benefits of GDS

Direct Memory Access (DMA)

GDS enables direct memory access between GPU memory and storage devices, such as NVMe SSDs or network-attached storage (NAS), without the need for intermediate data copies through the CPU or system memory.

Increased Bandwidth

By eliminating the need for data to pass through the CPU and system memory, GDS can achieve higher bandwidth for data transfers between storage and GPU memory. This is particularly beneficial for data-intensive applications that require fast access to large datasets.

Reduced Latency

GDS reduces transfer latency by removing the intermediate copy through system memory. With fewer steps between storage and GPU memory, data is available to the GPU sooner, enabling faster processing and analysis.

Decreased CPU Utilization

With GDS, the CPU still issues each I/O request but no longer copies the data through system memory, freeing CPU cycles and memory bandwidth for other tasks. This can improve overall system performance and efficiency.

Compatibility with CUDA

GDS works with GPU buffers allocated through the standard CUDA memory APIs, so developers can adopt it within the CUDA programming model they already use.

Under the hood, GDS relies on DMA engines in the storage path, for example in an NVMe controller or a network adapter, to move data directly into or out of GPU memory.

The data travels over the PCIe (Peripheral Component Interconnect Express) bus between the storage device and the GPU. The CPU still sets up each transfer, but the data no longer has to be staged in a bounce buffer in system memory.

GDS also introduces a new set of APIs called cuFile, which provide a high-level interface for performing I/O operations directly between GPU memory and storage.

These APIs abstract the complexities of low-level I/O operations and offer a more user-friendly way for developers to incorporate GDS into their applications.
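As a rough illustration of what a cuFile-backed read can look like from application code, the sketch below uses KvikIO, a RAPIDS Python library that wraps cuFile, together with CuPy for the GPU buffer. Both libraries are assumptions here, since the text above does not name any language binding, and the `load_bytes` helper is hypothetical. When the bindings are not installed, the function falls back to an ordinary read into host memory, which is the traditional path rather than GDS.

```python
def load_bytes(path, nbytes):
    """Read the first nbytes of path, into GPU memory when possible."""
    try:
        # Assumed optional dependencies: CuPy for the GPU buffer and
        # KvikIO for Python access to cuFile.
        import cupy
        import kvikio
    except ImportError:
        # Fallback for machines without the bindings: a plain read into
        # host memory, i.e. the traditional path.
        with open(path, "rb") as f:
            return f.read(nbytes)

    gpu_buf = cupy.empty(nbytes, dtype=cupy.uint8)  # destination in GPU memory
    with kvikio.CuFile(path, "r") as f:
        f.read(gpu_buf)  # fills the buffer starting at file offset 0
    return gpu_buf
```

The return type differs between the two branches (a CuPy array on the GPU path, a `bytes` object on the fallback path), so a caller that needs host data on both would convert the array explicitly. Whether the GPU branch actually takes the direct path depends on the system meeting the GDS requirements; the library may otherwise run in a compatibility mode.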

To take advantage of GDS, several components need to be in place:

  1. NVIDIA GPU with GDS support: GDS requires a compatible NVIDIA GPU, such as the data center class NVIDIA A100 or V100. Support on other product lines varies, so check NVIDIA's GDS documentation for a specific model.

  2. GDS-enabled storage: Local NVMe SSDs, or a network or distributed storage system whose file system and driver stack support GDS, must be configured so that data can move directly to and from the GPU.

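  3. GDS-enabled device drivers: Device drivers for both the GPU and storage devices must be GDS-enabled to facilitate the direct data path. On Linux, this has typically meant loading NVIDIA's nvidia-fs kernel module alongside the GPU driver.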

  4. CUDA and cuFile APIs: Applications must be developed using CUDA and the cuFile APIs to leverage GDS functionality.
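Part of this checklist can be verified programmatically. The sketch below covers only the driver item: it looks for NVIDIA's GDS kernel module in the Linux list of loaded modules, assuming the module appears under the name `nvidia_fs` in `/proc/modules`. The helper names are hypothetical, and a positive result does not confirm that the GPU, storage, or file system requirements are met.

```python
def module_loaded(proc_modules_text, name):
    """Return True if name is the first field of any line in /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def nvidia_fs_loaded(path="/proc/modules"):
    """Check the running Linux system; returns False if the file is unreadable."""
    try:
        with open(path) as f:
            # Assumption: the GDS kernel module is listed as "nvidia_fs".
            return module_loaded(f.read(), "nvidia_fs")
    except OSError:
        return False


if __name__ == "__main__":
    print("nvidia_fs loaded:", nvidia_fs_loaded())
```

Keeping the parsing in `module_loaded` separate from the file access makes the check easy to exercise on captured text from another machine.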

By meeting these requirements and integrating GDS into their applications, developers can shorten the data path between storage and GPU memory. The size of the gain depends on the workload and the hardware.

GDS has broad applicability across various domains, including high-performance computing (HPC), data analytics, machine learning, and more.

It is particularly beneficial for applications that involve large-scale data processing, such as scientific simulations, big data analysis, and deep learning training.

Copyright Continuum Labs - 2023