# NVIDIA DGX-2

At the time of its <mark style="color:yellow;">2018</mark> release, traditional data centre architectures were increasingly unable to cope with the demands of modern AI workloads, which require immense computational power and high-speed interconnects to train ever more complex models.&#x20;

This challenge necessitated a paradigm shift towards more scalable and integrated systems.&#x20;

NVIDIA's response to this challenge was the <mark style="color:blue;">**DGX-2**</mark>, a system designed to offer unprecedented levels of compute performance and interconnect bandwidth, enabling the training of models that were previously untrainable due to hardware limitations.

NVIDIA's DGX-2 stood as a major leap forward. When it was released, it claimed the title of <mark style="color:yellow;">**"the world's most powerful AI system for the most complex AI challenges."**</mark>&#x20;

The system carried a price tag of around US$400,000.

### <mark style="color:purple;">The Evolution from DGX-1 to DGX-2</mark>

The DGX-2 expanded dramatically on the DGX-1's foundation.&#x20;

Instead of eight GPUs, it packed <mark style="color:yellow;">**16 GPUs**</mark> and replaced the NVLink bus with NVIDIA’s more scalable <mark style="color:blue;">**NVSwitch**</mark> technology.&#x20;

This change allowed the DGX-2 to tackle deep learning and other demanding AI and HPC workloads up to 10 times faster than the DGX-1.

The system was a behemoth, both in terms of size and capability.&#x20;

It weighed in at <mark style="color:yellow;">**163.3kg**</mark> (360lbs) and took up <mark style="color:yellow;">**10 rack units**</mark>, compared to the 3 rack units of the DGX-1.&#x20;

It required up to 10kW of power, a figure that rose with the introduction of the DGX-2H model, which demanded up to 12kW.

### <mark style="color:purple;">A Closer Look at the DGX-2</mark>

Here’s what made the DGX-2 stand out:

* <mark style="color:blue;">**GPUs:**</mark> The DGX-2 featured <mark style="color:yellow;">**16 NVIDIA Tesla V100 GPUs**</mark>. This doubling of GPU capacity, compared to the DGX-1, allowed for unprecedented computational power.
* <mark style="color:blue;">**Memory and Storage:**</mark> It came with 1.5 TB of system RAM and 30 TB of high-performance [NVMe SSD storage](#user-content-fn-1)[^1], expandable to 60 TB.
* <mark style="color:blue;">**Networking:**</mark> The server was equipped with high-bandwidth network interfaces, including dual 10/25/40/50/100GbE options and up to 8 x 100Gb/sec Infiniband connectivity.
* <mark style="color:blue;">**CPU:**</mark> At its core, the DGX-2 had two 24-core Intel Xeon Platinum 8168 processors, providing robust support for the GPUs.
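
The headline figures above can be sanity-checked with simple arithmetic. A minimal sketch (the 125 TFLOPS per-GPU Tensor Core peak is the commonly quoted V100 figure; the other numbers come from the list above):

```python
# Aggregate DGX-2 figures derived from the per-component specs.

NUM_GPUS = 16
HBM2_PER_GPU_GB = 32           # Tesla V100, 32 GB variant
TENSOR_TFLOPS_PER_GPU = 125    # widely quoted V100 Tensor Core peak (mixed precision)

total_gpu_memory_gb = NUM_GPUS * HBM2_PER_GPU_GB
total_tensor_pflops = NUM_GPUS * TENSOR_TFLOPS_PER_GPU / 1000

print(f"Total GPU memory: {total_gpu_memory_gb} GB")            # 512 GB
print(f"Peak Tensor throughput: {total_tensor_pflops} PFLOPS")  # 2.0 PFLOPS
```

The famous 2 petaFLOPS marketing figure falls straight out of 16 GPUs running at the V100's Tensor Core peak.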

<figure><img src="https://1839612753-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FpV8SlQaC976K9PPsjApL%2Fuploads%2F6nOLmMvm9pLtQAYBlVSD%2Fimage.png?alt=media&#x26;token=874dfa0f-5744-4679-acd7-b26db754a32c" alt=""><figcaption></figcaption></figure>

### <mark style="color:purple;">Performance and Impact</mark>

The DGX-2’s performance was groundbreaking, delivering <mark style="color:yellow;">2 petaFLOPS</mark> of processing power.&#x20;

This level of performance meant that the *<mark style="color:yellow;">**DGX-2 could match the output of 300 dual-socket Xeon servers**</mark>*, which would have cost around $2.7 million and occupied significantly more space.&#x20;

Thus, despite its high upfront cost, the DGX-2 presented a cost-effective solution for intensive AI and HPC workloads.
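
NVIDIA's cost argument reduces to simple arithmetic; a quick sketch using the figures quoted above:

```python
# Cost comparison implied by NVIDIA's launch claim: one DGX-2 vs
# ~300 dual-socket Xeon servers of comparable deep learning throughput.

dgx2_cost = 400_000            # USD, approximate DGX-2 list price
xeon_cluster_cost = 2_700_000  # USD, ~300 dual-socket Xeon servers

savings_factor = xeon_cluster_cost / dgx2_cost
print(f"Equivalent CPU cluster costs ~{savings_factor:.2f}x more")  # ~6.75x
```

And that ratio ignores the extra rack space, power, and networking the 300-server cluster would also consume.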

### <mark style="color:purple;">Legacy and Conclusion</mark>

Though alternatives have since emerged, at the time, the DGX-2 represented a pinnacle in AI-focused servers.

It addressed the needs of the most complex AI tasks by dramatically reducing the time and infrastructure required to train deep learning models. NVIDIA not only sold a server but also *<mark style="color:yellow;">delivered a comprehensive ecosystem</mark>* that supported the most advanced AI research and applications.

### <mark style="color:purple;">NVIDIA NVSwitch—Revolutionising AI Network Fabric</mark>

The introduction of the <mark style="color:blue;">**NVIDIA NVSwitch**</mark> represented a leap in networking technology, akin to the evolution from dial-up to broadband.&#x20;

NVSwitch enables a previously unattainable level of model parallelism, providing <mark style="color:yellow;">2.4TB/s</mark> of bisection bandwidth, a 24-fold increase over previous generations.

This high-performance interconnect fabric allows for unprecedented scaling, making it possible to train complex models efficiently across all <mark style="color:yellow;">16</mark> GPUs.
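
One plausible back-of-the-envelope route to the quoted 2.4TB/s figure: each V100 contributes six NVLink 2.0 links at 25GB/s per direction, and bisecting the 16-GPU fabric cuts it into two halves of eight GPUs. A sketch under those assumptions:

```python
# Back-of-the-envelope derivation of the 2.4 TB/s bisection-bandwidth figure.

NVLINKS_PER_GPU = 6         # NVLink 2.0 links per Tesla V100
GB_S_PER_LINK_PER_DIR = 25  # NVLink 2.0 bandwidth per link, per direction
GPUS_PER_HALF = 8           # 16 GPUs split into two halves across the NVSwitch fabric

per_direction_gb_s = GPUS_PER_HALF * NVLINKS_PER_GPU * GB_S_PER_LINK_PER_DIR  # 1200 GB/s
bidirectional_tb_s = 2 * per_direction_gb_s / 1000

print(f"Bisection bandwidth: {bidirectional_tb_s} TB/s")  # 2.4 TB/s
```

Counting both directions of every NVLink crossing the cut yields exactly the 2.4TB/s NVIDIA quoted.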

<figure><img src="https://1839612753-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FpV8SlQaC976K9PPsjApL%2Fuploads%2FCtJDDUGTj0ho05U45HPl%2Fimage.png?alt=media&#x26;token=1bed50cb-a187-4cdd-9c7b-0196d453db27" alt=""><figcaption><p>The new NVSwitches mean that the PCIe lanes of the CPUs can be redirected elsewhere, most notably towards storage and networking connectivity</p></figcaption></figure>

### <mark style="color:purple;">A comparison between the DGX-2 and the DGX-1</mark>

| Specification      | NVIDIA DGX-2                                                              | NVIDIA DGX-1                                                              |
| ------------------ | ------------------------------------------------------------------------- | ------------------------------------------------------------------------- |
| **CPUs**           | 2 x Intel Xeon Platinum 8168                                              | 2 x Intel Xeon E5-2600 v4                                                 |
| **GPUs**           | <mark style="color:yellow;">16</mark> x NVIDIA Tesla V100, 32GB HBM2 each | <mark style="color:yellow;">8</mark> x NVIDIA Tesla V100, 16 GB HBM2 each |
| **System Memory**  | Up to <mark style="color:yellow;">1.5</mark> TB DDR4                      | Up to <mark style="color:yellow;">0.5</mark> TB DDR4                      |
| **GPU Memory**     | <mark style="color:yellow;">512 GB</mark> HBM2 (16 x 32 GB)               | <mark style="color:yellow;">128 GB</mark> HBM2 (8 x 16 GB)                |
| **Storage**        | <mark style="color:yellow;">30 TB</mark> NVMe, expandable up to 60 TB     | <mark style="color:yellow;">4 x 1.92</mark> TB NVMe                       |
| **Networking**     | <mark style="color:yellow;">8</mark> x Infiniband or 8 x 100 GbE          | <mark style="color:yellow;">4</mark> x Infiniband + 2 x 10 GbE            |
| **Power**          | <mark style="color:yellow;">10</mark> kW                                  | <mark style="color:yellow;">3.5</mark> kW                                 |
| **Weight**         | 360 lbs                                                                   | 134 lbs                                                                   |
| **GPU Throughput** | Tensor: 1920 TFLOPs, FP16: 480 TFLOPs, FP32: 240 TFLOPs, FP64: 120 TFLOPs | Tensor: 960 TFLOPs, FP16: 240 TFLOPs, FP32: 120 TFLOPs, FP64: 60 TFLOPs   |
| **Cost**           | $399,000                                                                  | $149,000                                                                  |
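
A useful sanity check on the throughput row: the aggregate numbers for both systems reduce to identical per-GPU peaks, consistent with both machines using Tesla V100 silicon. A small sketch using the table's figures:

```python
# Per-GPU throughput implied by the aggregate figures in the table above.

systems = {
    "DGX-2": {"gpus": 16, "tensor_tflops": 1920, "fp32_tflops": 240},
    "DGX-1": {"gpus": 8,  "tensor_tflops": 960,  "fp32_tflops": 120},
}

for name, spec in systems.items():
    per_gpu_tensor = spec["tensor_tflops"] / spec["gpus"]
    per_gpu_fp32 = spec["fp32_tflops"] / spec["gpus"]
    print(f"{name}: {per_gpu_tensor:.0f} Tensor TFLOPS/GPU, {per_gpu_fp32:.0f} FP32 TFLOPS/GPU")
# Both systems work out to 120 Tensor TFLOPS and 15 FP32 TFLOPS per GPU.
```

The DGX-2's advantage in the table is therefore purely one of scale (and interconnect), not of a faster individual GPU.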

### <mark style="color:purple;">System Specifications</mark>

| **Component**               | **Specification**                                                                 |
| --------------------------- | --------------------------------------------------------------------------------- |
| GPUs                        | 16x NVIDIA® Tesla® V100                                                           |
| GPU Memory                  | 512GB total                                                                       |
| Performance                 | 2 petaFLOPS                                                                       |
| NVIDIA CUDA® Cores          | 81,920                                                                            |
| NVIDIA Tensor Cores         | 10,240                                                                            |
| NVSwitches                  | 12                                                                                |
| Maximum Power Usage         | 10 kW                                                                             |
| CPU                         | Dual Intel Xeon Platinum 8168, 2.7 GHz, 24-cores                                  |
| System Memory               | 1.5TB                                                                             |
| Network                     | 8x 100Gb/sec Infiniband/100GigE, Dual 10/25/40/50/100GbE                          |
| Storage                     | OS: 2x 960GB NVMe SSDs, Internal Storage: 30TB (8x 3.84TB) NVMe SSDs              |
| Software                    | Ubuntu Linux OS, Red Hat Enterprise Linux OS                                      |
| System Weight               | 360 lbs (163.29 kgs)                                                              |
| Packaged System Weight      | 400 lbs (181.44 kgs)                                                              |
| System Dimensions           | Height: 17.3 in, Width: 19.0 in, Length: 31.3 in (no bezel), 32.8 in (with bezel) |
| Operating Temperature Range | 5°C to 35°C (41°F to 95°F)                                                        |
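
The 10kW maximum draw has direct facilities implications. A rough sketch converting power draw into cooling load (the kW-to-BTU/hr factor is the standard conversion; this is an illustrative estimate, not a figure from NVIDIA's documentation):

```python
# Estimate the cooling load implied by the DGX-2's maximum power draw.

MAX_POWER_KW = 10
BTU_HR_PER_KW = 3412.14  # 1 kW of dissipated heat ~= 3,412 BTU/hr

cooling_btu_hr = MAX_POWER_KW * BTU_HR_PER_KW
print(f"Cooling load at full draw: ~{cooling_btu_hr:,.0f} BTU/hr")  # ~34,121 BTU/hr
```

That is roughly the output of three residential air-conditioning units dedicated to a single 10U chassis, which illustrates why DGX-2 deployments were a data centre planning exercise as much as a purchase.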

[^1]: NVMe (Non-Volatile Memory Express) SSD storage is a type of solid-state drive (SSD) technology that uses the NVMe interface specification for accessing non-volatile storage media attached via PCI Express (PCIe) bus. NVMe SSDs are designed to take full advantage of the high-speed PCIe bus, significantly outperforming older storage interfaces like SATA in terms of speed, lower latency, and increased input/output operations per second (IOPS).&#x20;
