# High Bandwidth Memory (HBM)
HBM stacks DRAM dies vertically with through-silicon vias and places the stack adjacent to a compute die (CPU/GPU/ASIC) on a silicon interposer, yielding 10-30× the bandwidth of DDR memory for the same power budget; the arithmetic behind these figures is sketched just below. HBM3E (2024) runs ~1.2 TB/s per stack, and HBM4 (2025-2026) roughly doubles that. HBM is essential for modern AI accelerators, but it is expensive, thermally constrained, and supplied primarily by SK Hynix, Samsung, and Micron, making the HBM supply chain a chokepoint in AI-chip production.
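The headline ratio can be reproduced from interface widths and per-pin data rates. A minimal sketch, assuming spec-rate figures (one 1024-bit HBM3E stack at 9.6 Gb/s per pin against one 64-bit DDR5-6400 channel); shipping parts vary, so treat the exact pin rates as illustrative:

```python
# Peak interface bandwidth = (per-pin rate in Gbit/s) * (width in bits) / 8 bits-per-byte.
def peak_bandwidth_gbs(pin_rate_gbps: float, width_bits: int) -> float:
    """Peak bandwidth of a memory interface in GB/s."""
    return pin_rate_gbps * width_bits / 8

hbm3e_stack = peak_bandwidth_gbs(9.6, 1024)  # one HBM3E stack: ~1229 GB/s (~1.2 TB/s)
ddr5_channel = peak_bandwidth_gbs(6.4, 64)   # one DDR5-6400 channel: 51.2 GB/s

print(f"HBM3E stack:  {hbm3e_stack:6.1f} GB/s")
print(f"DDR5 channel: {ddr5_channel:6.1f} GB/s")
print(f"ratio:        {hbm3e_stack / ddr5_channel:.0f}x")  # ~24x, per stack vs per channel
```

The 10-30× range quoted above reflects system-level ratios, which depend on how many HBM stacks sit on the package versus how many DDR channels the comparison server has.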
**High Bandwidth Memory (HBM)** is a stacked-DRAM memory technology in which multiple DRAM dies are vertically integrated using **through-silicon vias (TSVs)** and placed next to a compute die (CPU, GPU, ASIC) on a **silicon interposer** substrate. It provides order-of-magnitude higher bandwidth than conventional DDR memory at similar or lower power, at significantly higher cost.

## Generations

| Standard | Year | Per-stack bandwidth | Typical use |
|---|---|---|---|
| HBM1 | 2015 | ~128 GB/s | AMD Fury GPU |
| HBM2 | 2016 | ~256 GB/s | Nvidia P100, V100 |
| HBM2E | 2020 | ~460 GB/s | Nvidia A100, AMD MI250 |
| HBM3 | 2022 | ~819 GB/s | Nvidia H100 |
| HBM3E | 2024 | ~1.2 TB/s | Nvidia H200, B100/B200; Google TPU v5p |
| HBM4 | 2025-2026 | ~2.0 TB/s | next-gen accelerators (Nvidia Rubin class) |

## Why it matters

AI accelerators are **memory-bandwidth-bound**, not compute-bound. The Memory Wall means most energy is spent moving data, not doing math. HBM dramatically improves the compute-to-data-movement ratio (the sketch at the end of this section puts numbers on the gap):

- An H100 with HBM3 provides 80 GB of memory at ~3 TB/s aggregate bandwidth.
- The same capacity on DDR5 would deliver ~50 GB/s per channel, roughly **60× slower** for bandwidth-bound workloads.
- Per-bit energy is also lower, because the physical distance data travels is measured in mm rather than cm and the interface is wider at lower voltage.

Without HBM, modern large-language-model training and inference at reasonable economics would be impossible. HBM is arguably the single most strategic semiconductor component of the AI era, alongside leading-edge logic.
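A back-of-the-envelope check on the bullets above, using only the numbers quoted in this section: when each step of a workload must stream a fixed working set out of memory (the common pattern in LLM decode, where every weight is read once per token), step time is bounded below by bytes moved divided by bandwidth.

```python
def stream_time_s(bytes_moved: float, bandwidth_bps: float) -> float:
    """Lower bound on the time to move `bytes_moved` once over a memory interface."""
    return bytes_moved / bandwidth_bps

WORKING_SET = 80e9   # 80 GB resident, the H100 capacity figure above
HBM3_BW = 3.0e12     # ~3 TB/s aggregate HBM3 bandwidth
DDR5_BW = 50e9       # ~50 GB/s for a single DDR5 channel

t_hbm = stream_time_s(WORKING_SET, HBM3_BW)  # ~27 ms per full sweep
t_ddr = stream_time_s(WORKING_SET, DDR5_BW)  # ~1.6 s per full sweep

print(f"HBM3: {t_hbm * 1e3:7.1f} ms/sweep ({1 / t_hbm:5.1f} sweeps/s)")
print(f"DDR5: {t_ddr * 1e3:7.1f} ms/sweep ({1 / t_ddr:5.1f} sweeps/s)")
print(f"slowdown: {t_ddr / t_hbm:.0f}x")     # the ~60x quoted above
```

Sweeps per second is an upper bound on tokens per second for a fully memory-bound decode, which is why bandwidth rather than peak FLOPs sets the serving ceiling.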
## Supply chain

- **SK Hynix** (Korea): current HBM3E market leader, primary supplier to Nvidia.
- **Samsung**: strong historical position; lost some H100 supply to SK Hynix due to quality issues.
- **Micron** (US): third entrant, gaining share in HBM3E.
- No Chinese domestic production at a competitive generation yet.

## Strategic significance

Because HBM production is concentrated in roughly three firms across Korea and the US, and because Nvidia-class accelerators need HBM, the HBM supply chain is a **geopolitical chokepoint**. US and Korean controls on HBM exports to China are central to current AI-hardware sanctions regimes. HBM4's rollout timing is more important to the AI-chip market than most GPU architectural changes.

## Cost

- HBM represents roughly **25-50% of bill-of-materials cost** on an H100-class accelerator; a single H100 carries ~$2-3K of HBM3.
- HBM cost per GB actually rose over 2023-2025 because demand outstripped supply; that is unusual for memory, which historically trends downward in price.
- Secondary markets exist: cloud GPU prices reflect HBM availability, not just silicon yield.

## Physical form

- Stack heights: 4, 8, 12, or 16 DRAM dies (4-hi, 8-hi, 12-hi, 16-hi).
- Typical stack capacity: 16-32 GB.
- Typical package: 1-8 stacks around a compute die on a silicon interposer.
- Interposer manufacturing (TSMC's CoWoS process) is itself a supply-constrained step.

A package-level calculator built from these figures appears at the end of this page.

## Limitations

- **Cost**: HBM is ~5-10× more expensive per GB than DDR5.
- **Capacity**: total capacity per package is smaller than in DDR systems (e.g., 80 GB of HBM on an H100 vs 2 TB of DDR5 in a large server).
- **Thermal**: stacks run hot; cooling a 12-hi or 16-hi stack is nontrivial.
- **Interposer supply**: the advanced-packaging step is a bottleneck distinct from DRAM production itself.
- **Upgradeability**: HBM is soldered to the interposer; no field upgrades.

## Alternatives

- **DDR5 + large caches**: still viable for many workloads (general servers, some inference).
- **LPDDR5X** with a wide interface (Apple M-series, Snapdragon X Elite): bridges the gap at lower cost.
- **GDDR6X** / **GDDR7**: consumer-GPU memory, cheaper than HBM, lower bandwidth.
- **Optical SRAM and the Photonic Latch**: speculative future replacement.

## Related

- Memory Wall: the broader problem HBM partially solves.
- Q.ANT Photonic AI Processor (NPU 2, 2026): an alternative compute paradigm that still depends on HBM at the memory boundary.
- NVMe (Non-Volatile Memory Express): faster storage below the DRAM tier.
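Below is the package-level sketch referenced in the Physical form section; it combines the per-stack figures on this page into package totals. The five-stack, 16 GB-per-stack configuration mirrors the H100's published 80 GB capacity; the per-stack bandwidth and the ~$31/GB cost (the ~$2.5K midpoint of the BOM figure above divided by 80 GB) are assumptions for illustration, not vendor data.

```python
from dataclasses import dataclass

@dataclass
class HBMPackage:
    stacks: int           # stacks placed around the compute die (1-8 typical)
    gb_per_stack: int     # 16-32 GB for current parts
    gbs_per_stack: float  # per-stack spec-rate bandwidth, GB/s
    usd_per_gb: float     # rough cost per GB (assumed; see lead-in)

    @property
    def capacity_gb(self) -> int:
        return self.stacks * self.gb_per_stack

    @property
    def peak_bandwidth_tbs(self) -> float:
        return self.stacks * self.gbs_per_stack / 1000

    @property
    def memory_cost_usd(self) -> float:
        return self.capacity_gb * self.usd_per_gb

# H100-like configuration: 5 active HBM3 stacks of 16 GB each (assumed layout).
pkg = HBMPackage(stacks=5, gb_per_stack=16, gbs_per_stack=819, usd_per_gb=31)

print(f"capacity: {pkg.capacity_gb} GB")               # 80 GB
print(f"peak BW:  {pkg.peak_bandwidth_tbs:.2f} TB/s")  # ~4.1 TB/s at spec rate
print(f"HBM cost: ~${pkg.memory_cost_usd:,.0f}")       # ~$2.5K, inside the BOM range above
```

Note that spec-rate math overshoots shipping parts: the H100's published aggregate is ~3.35 TB/s because its HBM3 runs below the 819 GB/s per-stack ceiling, so the per-stack numbers in the Generations table should be read as ceilings rather than guarantees.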