High Bandwidth Memory (HBM3)
SK Hynix Inc
High Bandwidth Memory 3 (HBM3) is the third major generation of the High Bandwidth Memory standard.
It is an advanced memory system that provides very high data transfer speeds (bandwidth), consumes little power, and packs a large amount of memory (high capacity) into a small physical size (form factor).
HBM is a type of memory architecture used in high-performance computing, known for its ability to provide extremely high memory bandwidth.
HBM3e is the latest generation in the HBM series, following HBM2, HBM2E, and HBM3.
The 'e' in HBM3e denotes an enhanced version of the HBM3 standard.
Further development is likely: the market may soon see a second generation of HBM3 devices, following the trend set by standards such as LPDDR5, which have already received speed upgrades.
HBM also uses a very wide interface to the processor chip.
An interface is how two different parts of a system connect and communicate with each other. By using many parallel connections (like having many lanes on a highway), HBM can send and receive a massive amount of data to/from the processor simultaneously.

One of the most significant advantages of HBM3 is its increased storage capacity.
Supporting die densities of up to 32 Gb and stacks up to 16 dies high, HBM3 can provide a maximum of 64 GB per stack, almost triple the capacity of HBM2E.
This expanded memory capacity is crucial for handling the increasing demands of advanced applications.
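To make the arithmetic behind that 64 GB figure explicit, here is a quick sanity check (a minimal sketch using only the density and stack-height numbers quoted above):

```python
# Arithmetic behind the maximum HBM3 stack capacity.
die_density_gbit = 32   # maximum DRAM die density: 32 Gb
stack_height = 16       # maximum dies per stack: 16-high

total_gbit = die_density_gbit * stack_height  # 512 Gb per stack
total_gbyte = total_gbit // 8                 # 8 bits per byte

print(f"{total_gbit} Gb per stack = {total_gbyte} GB")  # 512 Gb = 64 GB
```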

Data Transfer
In addition to its storage capabilities, HBM3 boasts impressive speed, with a top data transfer rate of 6.4 Gbps per pin, nearly double that of HBM2E (3.6 Gbps).
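To put that per-pin rate in context, here is a back-of-the-envelope calculation of peak per-stack bandwidth (a sketch assuming the standard 1024-bit HBM stack interface described in the Bandwidth section below):

```python
# Peak per-stack bandwidth from the per-pin data rate and interface width.
pin_rate_gbps = 6.4          # HBM3 maximum data rate per pin (Gb/s)
interface_width_bits = 1024  # total data pins per HBM stack

peak_gbit_s = pin_rate_gbps * interface_width_bits  # 6553.6 Gb/s
peak_gbyte_s = peak_gbit_s / 8                      # 819.2 GB/s per stack

print(f"Peak bandwidth: {peak_gbyte_s:.1f} GB/s per stack")
```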
HBM memory stacks several chips vertically on a substrate, which is then connected to a processor or GPU via a silicon interposer.
Vertical Stacking and Silicon Interposer
HBM uses an innovative approach of stacking multiple DRAM dies on top of each other vertically.
DRAM stands for Dynamic Random Access Memory, which is the type of memory commonly used in computers. A die is a small block of semiconducting material on which a given functional circuit is fabricated. So an HBM stack has several DRAM dies stacked up.
Unlike traditional memory layouts, where chips are placed side by side, HBM stacks multiple DRAM chips vertically on a substrate.
The stacked DRAM chips are connected to a processor or GPU using a silicon interposer, which is a thin layer of silicon that sits between the memory stack and the processor/GPU.
The silicon interposer contains a large number of tiny wires (interconnects) that enable high-speed communication between the stacked memory and the processor/GPU.
This vertical stacking and use of a silicon interposer allow for a much wider interface and higher bandwidth compared to traditional memory configurations.

The Benefits of Stacking
The DRAM dies are linked together using vertical interconnects called Through-Silicon Vias (TSVs).
A TSV is a vertical electrical connection that passes completely through a silicon die. It allows the stacked dies to communicate with each other much faster than traditional wire-bonding. Think of it as an elevator shaft that lets data move between different floors (dies) quickly.
This vertical stacking, combined with a wider interface, enables much higher bandwidth than traditional planar DRAM layouts.
This significant increase in bandwidth enables faster processing and improved overall system performance.
Power Efficiency
Another key benefit of HBM3 is its improved power efficiency.
HBM3 reduces the core voltage (the voltage supplied to the DRAM chips) from HBM2E's 1.2 volts to 1.1 volts.
This lower voltage means less power is consumed by the memory. This allows HBM3 to offer substantial power savings without compromising performance.
Remember, power consumption is proportional to the square of the voltage, so even a small reduction in voltage can have a significant impact on power efficiency. The challenge is maintaining signal integrity and data retention at lower voltages.
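Plugging the two supply voltages into that square-law relationship gives a rough sense of the savings (an approximation only; real dynamic power also depends on capacitance and switching frequency):

```python
# Dynamic power scales roughly with V^2 (P ~ C * V^2 * f).
v_hbm2e = 1.2  # HBM2E core voltage (V)
v_hbm3 = 1.1   # HBM3 core voltage (V)

ratio = (v_hbm3 / v_hbm2e) ** 2
print(f"Relative dynamic power: {ratio:.2f}")            # ~0.84
print(f"Approximate savings: {(1 - ratio) * 100:.0f}%")  # ~16%
```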
This improved power efficiency has, in turn, permitted improvements in bandwidth and reliability.
Bandwidth
HBM3 achieves its higher bandwidth through an enhanced channel architecture, dividing its 1024-bit interface into 16 64-bit channels or 32 32-bit pseudo-channels.
What are pseudo-channels? HBM3 splits each physical 64-bit channel into two 32-bit "pseudo-channels". This effectively doubles the number of independent sub-channels from 16 to 32.
More pseudo-channels allow greater parallelism - more data can be accessed simultaneously from different regions of the DRAM. This improves bandwidth utilisation and performance.
However, the pseudo-channel logic does consume some additional power. The power savings from the core voltage reduction help offset this.
Nonetheless, this doubled number of pseudo-channels, combined with the increased data rate, results in a substantial performance improvement over HBM2E.
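As a way to picture that parallelism, the toy model below stripes consecutive address blocks across the 32 pseudo-channels. The round-robin mapping and 64-byte granularity are hypothetical illustrations, not the actual HBM3 address mapping, which is determined by the memory controller and the JEDEC specification:

```python
# Illustrative only: stripe addresses across pseudo-channels so that
# consecutive blocks land on different pseudo-channels and can be
# accessed in parallel.
NUM_CHANNELS = 16        # 64-bit channels in HBM3
PSEUDO_PER_CHANNEL = 2   # each split into two 32-bit pseudo-channels
NUM_PSEUDO = NUM_CHANNELS * PSEUDO_PER_CHANNEL  # 32

BLOCK_BYTES = 64  # hypothetical interleave granularity

def pseudo_channel(addr: int) -> int:
    """Map a byte address to a pseudo-channel index (toy round-robin)."""
    return (addr // BLOCK_BYTES) % NUM_PSEUDO

# Eight consecutive 64-byte blocks fan out over eight different pseudo-channels.
for block in range(8):
    addr = block * BLOCK_BYTES
    print(f"address {addr:4d} -> pseudo-channel {pseudo_channel(addr)}")
```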
Reliability
HBM3 also incorporates advanced RAS (reliability, availability, and serviceability) features that enhance data integrity and system reliability.
On-die ECC
Error-Correcting Code (ECC) is a method of detecting and correcting bit errors in memory. HBM3 introduces on-die ECC, where the ECC bits are stored and the correction is performed within each DRAM die.
On-die ECC improves reliability by catching and fixing errors locally before data is transmitted to the host. However, the ECC circuits do add some power overhead. Careful design is needed to minimise this.
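To illustrate the principle, the sketch below uses a classic Hamming(7,4) code, which protects 4 data bits with 3 parity bits and can locate and correct any single-bit error. This is a textbook example of how ECC works in general, not HBM3's actual on-die ECC scheme:

```python
# Textbook Hamming(7,4): 4 data bits + 3 parity bits, single-error correction.
# Codeword positions 1..7 are laid out as: p1 p2 d1 p3 d2 d3 d4.

def encode(d):
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 2, 3, 6, 7
    p3 = d[1] ^ d[2] ^ d[3]  # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def correct(c):
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # re-check positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # re-check positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # re-check positions 4, 5, 6, 7
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the flipped bit, 0 if clean
    if syndrome:
        c[syndrome - 1] ^= 1         # flip the erroneous bit back
    return c

word = encode([1, 0, 1, 1])
word[5] ^= 1                         # inject a single-bit error
assert correct(word) == encode([1, 0, 1, 1])
print("single-bit error detected and corrected")
```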
Error Check and Scrub (ECS)
This is a background process that periodically reads data from the DRAM, checks the ECC for errors, and writes back corrected data if necessary.
ECS helps maintain data integrity over time, preventing the accumulation of bit errors. The scrubbing does consume some additional power, but it is essential for mission-critical applications.
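Conceptually, a scrubber is a paced background loop over the address space, as in this hypothetical software sketch (real ECS runs in hardware inside the DRAM devices):

```python
import time

# Hypothetical sketch of Error Check and Scrub: walk the address space,
# re-check each word, and write back corrected data when an error is found.

def scrub(memory, check_and_correct, step_delay_s=0.001):
    """One full scrub pass over `memory` (a list of words)."""
    for addr, word in enumerate(memory):
        corrected, had_error = check_and_correct(word)
        if had_error:
            memory[addr] = corrected  # fix before errors accumulate
        time.sleep(step_delay_s)      # pace the pass to bound its power/bandwidth cost

# Toy checker: treat any negative word as a detected, correctable error.
memory = [3, -1, 7]
scrub(memory, lambda w: (abs(w), w < 0))
print(memory)  # [3, 1, 7]
```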
Refresh Management
DRAM cells lose their data over time due to charge leakage and must be periodically refreshed. HBM3 introduces advanced refresh management techniques like Refresh Management (RFM) and Adaptive Refresh Management (ARFM).
These allow the refresh rate to be optimised based on temperature and usage conditions. Unnecessary refreshes can be avoided, saving power. The refresh logic does add some complexity and power, but the net effect is a power savings.
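The sketch below captures the adaptive idea: choose a refresh interval from the device temperature, refreshing more often when hot. The thresholds and intervals here are illustrative placeholders, not values from the HBM3 specification:

```python
# Hypothetical adaptive refresh: DRAM cells leak charge faster when hot,
# so shorten the refresh interval at high temperature and lengthen it
# (saving power) when cool.

def refresh_interval_ms(temp_c: float) -> float:
    base_ms = 64.0          # a common baseline DRAM refresh window
    if temp_c > 85:
        return base_ms / 4  # very hot: refresh 4x as often
    if temp_c > 45:
        return base_ms / 2  # warm: refresh 2x as often
    return base_ms          # cool: baseline rate, lowest refresh power

for t in (30, 60, 95):
    print(f"{t} C -> refresh every {refresh_interval_ms(t):.0f} ms")
```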
Latency
HBM3's new clocking architecture decouples the traditional host-to-device clock signal from the data strobe signals, allowing for a lower-latency, higher-performance solution when migrating from HBM2E to HBM3.
Clock Architecture
HBM3 decouples the command/address clock from the data bus clock. The command clock runs at half the frequency of the data clock. This allows the DRAM I/O to run faster without burdening the core DRAM arrays.
Splitting the clocks does require some additional clock generation and synchronisation logic which consumes power. But it enables a significant data rate increase without a proportional power increase.
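Taking the half-rate statement above at face value, the clock frequencies at the top 6.4 Gbps data rate work out as follows (a sketch assuming double-data-rate signalling, i.e. two bits per pin per data-clock cycle):

```python
# Clock frequencies at HBM3's top speed, assuming DDR signalling on the data bus.
pin_rate_gbps = 6.4

data_clock_ghz = pin_rate_gbps / 2      # 3.2 GHz data/strobe clock (2 bits per cycle)
command_clock_ghz = data_clock_ghz / 2  # 1.6 GHz command/address clock (half rate)

print(f"data clock: {data_clock_ghz} GHz, command clock: {command_clock_ghz} GHz")
```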
Summary
In conclusion, HBM3 represents a leap forward in memory technology, offering increased storage capacity, faster data transfer rates, improved power efficiency, and advanced features.
With its ability to meet the growing demands of high-performance computing applications, HBM3 is poised to become the memory solution of choice for industries seeking cutting-edge performance and efficiency.
As the adoption of HBM3 grows, we can expect to see groundbreaking advancements in graphics, cloud computing, networking, AI, and automotive sectors, propelling us into a new era of technological innovation.