# NVIDIA DGX-2

At the time of its <mark style="color:yellow;">2018</mark> release, traditional data centre architectures were increasingly unable to cope with the demands of modern AI workloads, which require immense computational power and high-speed interconnects to train increasingly complex models.&#x20;

This challenge necessitated a paradigm shift towards more scalable and integrated systems.&#x20;

NVIDIA's response to this challenge was the <mark style="color:blue;">**DGX-2**</mark>, a system designed to offer unprecedented levels of compute performance and interconnect bandwidth, enabling the training of models that were previously untrainable due to hardware limitations.

NVIDIA's DGX-2 stood as a major leap forward. At launch, it claimed the title of <mark style="color:yellow;">**"the world's most powerful AI system for the most complex AI challenges."**</mark>&#x20;

The system came with a price tag of around US$400,000.

### <mark style="color:purple;">The Evolution from DGX-1 to DGX-2</mark>

The DGX-2 expanded dramatically on the DGX-1's foundation.&#x20;

Instead of eight GPUs, it packed <mark style="color:yellow;">**16 GPUs**</mark> and replaced the NVLink bus with Nvidia’s more scalable <mark style="color:blue;">**NVSwitch**</mark> technology.&#x20;

This change allowed the DGX-2 to tackle deep learning and other demanding AI and HPC workloads up to 10 times faster than the DGX-1.

The system was a behemoth, both in terms of size and capability.&#x20;

It weighed in at <mark style="color:yellow;">**163.3kg**</mark> (360lbs) and took up <mark style="color:yellow;">**10 rack units**</mark>, compared to the 3 rack units of the DGX-1.&#x20;

It required up to 10kW of power, a figure that rose with the introduction of the DGX-2H model, which demanded up to 12kW.

### <mark style="color:purple;">A Closer Look at the DGX-2</mark>

Here’s what made the DGX-2 stand out:

* <mark style="color:blue;">**GPUs:**</mark> The DGX-2 featured <mark style="color:yellow;">**16 NVIDIA Tesla V100 GPUs**</mark>. This doubling of GPU capacity, compared to the DGX-1, allowed for unprecedented computational power.
* <mark style="color:blue;">**Memory and Storage:**</mark> It came with 1.5 TB of system RAM and 30 TB of high-performance [NVMe SSD storage](#user-content-fn-1)[^1], expandable to 60 TB.
* <mark style="color:blue;">**Networking:**</mark> The server was equipped with high-bandwidth network interfaces, including dual 10/25/40/50/100GbE options and up to 8 x 100Gb/sec InfiniBand connectivity.
* <mark style="color:blue;">**CPU:**</mark> At its core, the DGX-2 had two 24-core Intel Xeon Platinum 8168 processors, providing robust support for the GPUs.
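As a rough sanity check, the headline figures above multiply out from the per-GPU numbers. A minimal sketch, assuming NVIDIA's quoted per-V100 specifications (~125 TFLOPS of peak tensor-core throughput and 32 GB of HBM2 each):

```python
# Cross-checking the DGX-2's headline figures from per-GPU numbers.
# Assumed inputs (from NVIDIA's Tesla V100 specifications, not this page):
# ~125 TFLOPS peak tensor throughput and 32 GB HBM2 per GPU.
NUM_GPUS = 16
TENSOR_TFLOPS_PER_V100 = 125
HBM2_GB_PER_V100 = 32

total_pflops = NUM_GPUS * TENSOR_TFLOPS_PER_V100 / 1000   # aggregate petaFLOPS
total_hbm2_gb = NUM_GPUS * HBM2_GB_PER_V100               # aggregate GPU memory

print(f"Aggregate tensor throughput: {total_pflops} PFLOPS")  # 2.0 PFLOPS
print(f"Aggregate GPU memory: {total_hbm2_gb} GB HBM2")       # 512 GB
```

The result lines up with the 2 petaFLOPS and 512 GB figures quoted later in this page.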

<figure><img src="https://1839612753-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FpV8SlQaC976K9PPsjApL%2Fuploads%2F6nOLmMvm9pLtQAYBlVSD%2Fimage.png?alt=media&#x26;token=874dfa0f-5744-4679-acd7-b26db754a32c" alt=""><figcaption></figcaption></figure>

### <mark style="color:purple;">Performance and Impact</mark>

The DGX-2’s performance was groundbreaking, delivering <mark style="color:yellow;">2 petaFLOPS</mark> of processing power.&#x20;

This level of performance meant that the *<mark style="color:yellow;">**DGX-2 could match the output of 300 dual-socket Xeon servers**</mark>*, which would cost around $2.7 million and occupy significantly more space.&#x20;

Thus, despite its high upfront cost, the DGX-2 presented a cost-effective solution for intensive AI and HPC workloads.
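The cost argument reduces to simple arithmetic. A sketch using NVIDIA's launch comparison (the $2.7 million and 300-server figures come from that claim; the per-server cost is derived here, not quoted by NVIDIA):

```python
# NVIDIA's launch comparison: one ~$399,000 DGX-2 versus roughly 300
# dual-socket Xeon servers costing about $2.7 million in total.
dgx2_cost = 399_000
cluster_cost = 2_700_000
num_servers = 300

cost_ratio = cluster_cost / dgx2_cost      # hardware-cost advantage of the DGX-2
per_server = cluster_cost / num_servers    # implied cost per replaced server

print(f"Cost ratio: ~{cost_ratio:.1f}x")                  # ~6.8x
print(f"Implied per-server cost: ${per_server:,.0f}")     # $9,000
```

On these numbers alone the DGX-2 comes out roughly 6.8x cheaper than the cluster it claimed to replace, before counting space, power, and networking savings.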

### <mark style="color:purple;">Legacy and Conclusion</mark>

Though alternatives have since emerged, at the time, the DGX-2 represented a pinnacle in AI-focused servers.

It addressed the needs of the most complex AI tasks by dramatically reducing the time and infrastructure required to train deep learning models. Nvidia not only sold a server but also *<mark style="color:yellow;">delivered a comprehensive ecosystem</mark>* that supported the most advanced AI research and applications.

### <mark style="color:purple;">NVIDIA NVSwitch—Revolutionising AI Network Fabric</mark>

The introduction of the <mark style="color:blue;">**NVIDIA NVSwitch**</mark> represented a leap in networking technology, akin to the evolution from dial-up to broadband.&#x20;

NVSwitch enabled a level of model parallelism previously unattainable, providing <mark style="color:yellow;">2.4TB/s</mark> of bisection bandwidth, a 24-fold increase over the previous generation.

This high-performance interconnect fabric allowed for unprecedented scaling, making it possible to train complex models across all <mark style="color:yellow;">16</mark> GPUs efficiently and effectively.
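The 2.4TB/s figure falls out of per-GPU NVLink arithmetic. A minimal sketch, assuming each V100 exposes six NVLink 2.0 links at 50GB/s bidirectional (figures from NVIDIA's V100 documentation, not this page):

```python
# Deriving the quoted 2.4 TB/s bisection bandwidth of the DGX-2.
# Assumed: each V100 has 6 NVLink 2.0 links at 50 GB/s bidirectional;
# a bisection splits the 16 GPUs into two halves of 8.
LINKS_PER_GPU = 6
GB_S_PER_LINK = 50        # bidirectional bandwidth per NVLink 2.0 link
GPUS_PER_HALF = 8

per_gpu_bw = LINKS_PER_GPU * GB_S_PER_LINK        # GB/s each GPU can move
bisection_tb_s = GPUS_PER_HALF * per_gpu_bw / 1000  # traffic crossing the cut

print(f"Per-GPU NVLink bandwidth: {per_gpu_bw} GB/s")   # 300 GB/s
print(f"Bisection bandwidth: {bisection_tb_s} TB/s")    # 2.4 TB/s
```

Because NVSwitch is non-blocking, every GPU can drive its full 300GB/s across the fabric simultaneously, which is what makes the aggregate figure achievable in practice.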

<figure><img src="https://1839612753-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FpV8SlQaC976K9PPsjApL%2Fuploads%2FCtJDDUGTj0ho05U45HPl%2Fimage.png?alt=media&#x26;token=1bed50cb-a187-4cdd-9c7b-0196d453db27" alt=""><figcaption><p>The new NVSwitches mean that the PCIe lanes of the CPUs can be redirected elsewhere, most notably towards storage and networking connectivity</p></figcaption></figure>

### <mark style="color:purple;">A comparison between the DGX-2 and the DGX-1</mark>

| Specification      | NVIDIA DGX-2                                                              | NVIDIA DGX-1                                                              |
| ------------------ | ------------------------------------------------------------------------- | ------------------------------------------------------------------------- |
| **CPUs**           | 2 x Intel Xeon Platinum                                                   | 2 x Intel Xeon E5-2600 v4                                                 |
| **GPUs**           | <mark style="color:yellow;">16</mark> x NVIDIA Tesla V100, 32GB HBM2 each | <mark style="color:yellow;">8</mark> x NVIDIA Tesla V100, 16 GB HBM2 each |
| **System Memory**  | Up to <mark style="color:yellow;">1.5</mark> TB DDR4                      | Up to <mark style="color:yellow;">0.5</mark> TB DDR4                      |
| **GPU Memory**     | <mark style="color:yellow;">512 GB</mark> HBM2 (16 x 32 GB)               | <mark style="color:yellow;">128 GB</mark> HBM2 (8 x 16 GB)                |
| **Storage**        | <mark style="color:yellow;">30 TB</mark> NVMe, expandable up to 60 TB     | <mark style="color:yellow;">4 x 1.92</mark> TB NVMe                       |
| **Networking**     | <mark style="color:yellow;">8</mark> x InfiniBand or 8 x 100 GbE          | <mark style="color:yellow;">4</mark> x InfiniBand + 2 x 10 GbE            |
| **Power**          | <mark style="color:yellow;">10</mark> kW                                  | <mark style="color:yellow;">3.5</mark> kW                                 |
| **Weight**         | 360 lbs                                                                   | 134 lbs                                                                   |
| **GPU Throughput** | Tensor: 1920 TFLOPs, FP16: 480 TFLOPs, FP32: 240 TFLOPs, FP64: 120 TFLOPs | Tensor: 960 TFLOPs, FP16: 240 TFLOPs, FP32: 120 TFLOPs, FP64: 60 TFLOPs   |
| **Cost**           | $399,000                                                                  | $149,000                                                                  |

### <mark style="color:purple;">System Specifications</mark>

| **Component**               | **Specification**                                                                 |
| --------------------------- | --------------------------------------------------------------------------------- |
| GPUs                        | 16x NVIDIA® Tesla® V100                                                           |
| GPU Memory                  | 512GB total                                                                       |
| Performance                 | 2 petaFLOPS                                                                       |
| NVIDIA CUDA® Cores          | 81,920                                                                            |
| NVIDIA Tensor Cores         | 10,240                                                                            |
| NVSwitches                  | 12                                                                                |
| Maximum Power Usage         | 10 kW                                                                             |
| CPU                         | Dual Intel Xeon Platinum 8168, 2.7 GHz, 24-cores                                  |
| System Memory               | 1.5TB                                                                             |
| Network                     | 8x 100Gb/sec InfiniBand/100GigE, Dual 10/25/40/50/100GbE                          |
| Storage                     | OS: 2x 960GB NVME SSDs, Internal Storage: 30TB (8x 3.84TB) NVME SSDs              |
| Software                    | Ubuntu Linux OS, Red Hat Enterprise Linux OS                                      |
| System Weight               | 360 lbs (163.29 kgs)                                                              |
| Packaged System Weight      | 400 lbs (181.44 kgs)                                                              |
| System Dimensions           | Height: 17.3 in, Width: 19.0 in, Length: 31.3 in (no bezel), 32.8 in (with bezel) |
| Operating Temperature Range | 5°C to 35°C (41°F to 95°F)                                                        |

[^1]: NVMe (Non-Volatile Memory Express) SSD storage is a type of solid-state drive (SSD) technology that uses the NVMe interface specification for accessing non-volatile storage media attached via PCI Express (PCIe) bus. NVMe SSDs are designed to take full advantage of the high-speed PCIe bus, significantly outperforming older storage interfaces like SATA in terms of speed, lower latency, and increased input/output operations per second (IOPS).&#x20;

