NVIDIA DGX-2

At the time of its 2018 release, traditional data centre architectures were increasingly unable to cope with the demands of modern AI workloads, which require immense computational power and high-speed interconnects to train increasingly complex models.

This challenge necessitated a paradigm shift towards more scalable and integrated systems.

NVIDIA's response to this challenge was the DGX-2, a system designed to offer unprecedented levels of compute performance and interconnect bandwidth, enabling the training of models that were previously untrainable due to hardware limitations.

Nvidia's DGX-2 stood as a major leap forward. When it was released, it claimed the title of "the world's most powerful AI system for the most complex AI challenges."

The system came with a price tag of around US$400,000.

The Evolution from DGX-1 to DGX-2

The DGX-2 expanded dramatically on the DGX-1's foundation.

Instead of eight GPUs, it packed 16, and it replaced the DGX-1's point-to-point NVLink topology with Nvidia's more scalable NVSwitch technology.

This change allowed the DGX-2 to tackle deep learning and other demanding AI and HPC workloads up to 10 times faster than the DGX-1.

The system was a behemoth, both in terms of size and capability.

It weighed in at 154.2 kg (340 lbs) and took up 10 rack units, compared to the 3 rack units of the DGX-1.

It required up to 10 kW of power, a figure that rose with the introduction of the DGX-2H model, which demanded up to 12 kW.

A Closer Look at the DGX-2

Here’s what made the DGX-2 stand out:

  • GPUs: The DGX-2 featured 16 NVIDIA Tesla V100 GPUs. This doubling of GPU capacity, compared to the DGX-1, allowed for unprecedented computational power (a rough sketch of the resulting pooled GPU memory follows this list).

  • Memory and Storage: It came with 1.5 TB of system RAM and 30 TB of high-performance NVMe storage, expandable to 60 TB.

  • Networking: The server was equipped with high-bandwidth network interfaces, including dual 10/25/40/50/100 GbE options and up to 8 x 100 Gb/sec InfiniBand connectivity.

  • CPU: At its core, the DGX-2 had two 24-core Intel Xeon Platinum 8168 processors, providing robust support for the GPUs.
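
To put the pooled GPU memory in perspective, here is a minimal sketch of whether a model of a given size fits across the DGX-2's 16 x 32 GB of HBM2. The two-bytes-per-parameter (FP16) figure is standard; the 20% allowance for activations and workspace is an illustrative assumption, not an NVIDIA specification.

```python
# Rough sketch: does a model fit in the DGX-2's pooled GPU memory?
# Assumes FP16 weights (2 bytes per parameter) plus a 20% allowance
# for activations and workspace -- illustrative figures only.

NUM_GPUS = 16
MEM_PER_GPU_GB = 32                            # Tesla V100 32 GB HBM2
TOTAL_GPU_MEM_GB = NUM_GPUS * MEM_PER_GPU_GB   # 512 GB in aggregate

def fits(params_billions: float, bytes_per_param: int = 2, overhead: float = 1.2) -> bool:
    """Estimate whether a model's weights (plus overhead) fit in aggregate GPU memory."""
    needed_gb = params_billions * bytes_per_param * overhead  # billions x bytes = GB
    return needed_gb <= TOTAL_GPU_MEM_GB

for b in (10, 100, 200, 300):
    print(f"{b}B params -> ~{b * 2 * 1.2:.0f} GB needed, fits: {fits(b)}")
```

On these assumptions, the weights of a model of roughly 200 billion FP16 parameters could be held across the system's GPUs; training would need considerably more headroom for gradients and optimiser state.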

Performance and Impact

The DGX-2’s performance was groundbreaking, delivering 2 petaFLOPS of processing power (peak mixed-precision Tensor Core throughput).

This level of performance meant that the DGX-2 could match the output of 300 dual-socket Xeon servers, which would cost around $2.7 million and occupy significantly more space.

Thus, despite its high upfront cost, the DGX-2 presented a cost-effective solution for intensive AI and HPC workloads.
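
Taking the figures in this section at face value (2 petaFLOPS for a US$399,000 DGX-2 versus roughly US$2.7 million for 300 dual-socket Xeon servers of equivalent output), a quick back-of-the-envelope comparison makes the cost argument concrete. The per-server price below is simply derived from those two totals and is illustrative only.

```python
# Back-of-the-envelope cost comparison using the figures quoted above.

dgx2_cost = 399_000            # USD list price
dgx2_pflops = 2.0              # peak AI performance, petaFLOPS

xeon_servers = 300             # dual-socket Xeon servers of equivalent output
xeon_cluster_cost = 2_700_000  # USD, figure quoted in the text

print(f"DGX-2:   ${dgx2_cost / dgx2_pflops:,.0f} per petaFLOP")
print(f"Cluster: ${xeon_cluster_cost / dgx2_pflops:,.0f} per petaFLOP "
      f"(~${xeon_cluster_cost / xeon_servers:,.0f} per server)")
print(f"Cost ratio: {xeon_cluster_cost / dgx2_cost:.1f}x in favour of the DGX-2")
```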

Legacy and Conclusion

Though alternatives have since emerged, the DGX-2 represented, at the time, the pinnacle of AI-focused servers.

It addressed the needs of the most complex AI tasks by dramatically reducing the time and infrastructure required to train deep learning models. With the DGX-2, Nvidia not only sold a server but delivered a comprehensive ecosystem that supported the most advanced AI research and applications.

NVIDIA NVSwitch: Revolutionising AI Network Fabric

The introduction of the NVIDIA NVSwitch represented a leap in networking technology, akin to the evolution from dial-up to broadband.

NVSwitch enables a previously unattainable level of model parallelism, providing 2.4 TB/s of bisection bandwidth, a 24-fold increase over the previous generation.

This high-performance interconnect fabric allows for unprecedented scaling, making it possible to train complex models across all 16 GPUs efficiently and effectively. And because NVSwitch carries the inter-GPU traffic, the PCIe lanes of the CPUs can be redirected elsewhere, most notably towards storage and networking connectivity.
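
The 2.4 TB/s figure follows directly from the per-GPU NVLink budget: each Tesla V100 exposes six NVLink 2.0 links at 50 GB/s of bidirectional bandwidth each, and the NVSwitch fabric lets all eight GPUs on one side of the bisection drive every link across it at full rate. A minimal check, assuming those standard per-link figures:

```python
# Sanity check of the 2.4 TB/s bisection bandwidth claim, assuming the
# published V100 NVLink 2.0 figures: 6 links per GPU, 50 GB/s bidirectional each.

GPUS_PER_HALF = 8    # 16 GPUs split evenly across the bisection
LINKS_PER_GPU = 6    # NVLink 2.0 links per Tesla V100
GBPS_PER_LINK = 50   # GB/s bidirectional per link

bisection_tb_s = GPUS_PER_HALF * LINKS_PER_GPU * GBPS_PER_LINK / 1000
print(f"Bisection bandwidth: {bisection_tb_s:.1f} TB/s")   # -> 2.4 TB/s
```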

A comparison between the DGX-2 and the DGX-1

| Specification | NVIDIA DGX-2 | NVIDIA DGX-1 |
| --- | --- | --- |
| CPUs | 2 x Intel Xeon Platinum | 2 x Intel Xeon E5-2600 v4 |
| GPUs | 16 x NVIDIA Tesla V100, 32 GB HBM2 each | 8 x NVIDIA Tesla V100, 16 GB HBM2 each |
| System Memory | Up to 1.5 TB DDR4 | Up to 0.5 TB DDR4 |
| GPU Memory | 512 GB HBM2 (16 x 32 GB) | 128 GB HBM2 (8 x 16 GB) |
| Storage | 30 TB NVMe, expandable up to 60 TB | 4 x 1.92 TB NVMe |
| Networking | 8 x InfiniBand or 8 x 100 GbE | 4 x InfiniBand + 2 x 10 GbE |
| Power | 10 kW | 3.5 kW |
| Weight | 350 lbs | 134 lbs |
| GPU Throughput | Tensor: 1,920 TFLOPS; FP16: 480 TFLOPS; FP32: 240 TFLOPS; FP64: 120 TFLOPS | Tensor: 960 TFLOPS; FP16: 240 TFLOPS; FP32: 120 TFLOPS; FP64: 60 TFLOPS |
| Cost | $399,000 | $149,000 |

System Specifications

| Component | Specification |
| --- | --- |
| GPUs | 16 x NVIDIA Tesla V100 |
| GPU Memory | 512 GB total |
| Performance | 2 petaFLOPS |
| NVIDIA CUDA Cores | 81,920 |
| NVIDIA Tensor Cores | 10,240 |
| NVSwitches | 12 |
| Maximum Power Usage | 10 kW |
| CPU | Dual Intel Xeon Platinum 8168, 2.7 GHz, 24 cores each |
| System Memory | 1.5 TB |
| Network | 8 x 100 Gb/sec InfiniBand/100 GigE; dual 10/25/40/50/100 GbE |
| Storage | OS: 2 x 960 GB NVMe SSDs; internal: 30 TB (8 x 3.84 TB) NVMe SSDs |
| Software | Ubuntu Linux OS or Red Hat Enterprise Linux OS |
| System Weight | 360 lbs (163.29 kg) |
| Packaged System Weight | 400 lbs (181.44 kg) |
| System Dimensions | Height: 17.3 in; width: 19.0 in; length: 31.3 in (no bezel), 32.8 in (with bezel) |
| Operating Temperature Range | 5°C to 35°C (41°F to 95°F) |
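
Most of the headline numbers in this table are simple multiples of the published per-GPU Tesla V100 specification, which the sketch below confirms. (The 125 TFLOPS per-GPU Tensor Core figure is NVIDIA's peak number; some tables, including the comparison above, quote 120 TFLOPS per GPU, which gives 1,920 rather than 2,000 TFLOPS.)

```python
# The DGX-2's headline figures as multiples of the per-GPU Tesla V100 spec.

V100 = {
    "cuda_cores": 5_120,
    "tensor_cores": 640,
    "hbm2_gb": 32,
    "tensor_tflops": 125,  # NVIDIA's peak mixed-precision Tensor Core figure
}
NUM_GPUS = 16

print(f"CUDA cores:   {NUM_GPUS * V100['cuda_cores']:,}")    # 81,920
print(f"Tensor cores: {NUM_GPUS * V100['tensor_cores']:,}")  # 10,240
print(f"GPU memory:   {NUM_GPUS * V100['hbm2_gb']} GB")      # 512 GB
print(f"Peak Tensor:  {NUM_GPUS * V100['tensor_tflops'] / 1000:.0f} petaFLOPS")  # 2
```

These multiples reflect the DGX-2's basic design idea: scale a known per-GPU building block to 16 units and let the NVSwitch fabric make them behave as a single large accelerator.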
