- The heart of an AI data center is widely believed to be the GPU.
- But the actual time GPUs spend computing is surprisingly low. Research shows that over 50% of attention kernel cycles during LLM inference are stalled waiting for data access (arXiv:2503.08311).
- The rest of the time? The GPU is just waiting for data.
- Think of it this way: even the brightest student loses efficiency if they cannot flip through their textbook fast enough.
- The technology that dramatically increases this “page-turning speed” is HBM — and its successor, HBF, is already emerging.
From PCs to GPUs — A Brief History of the Memory Bottleneck
GPUs Spend Half Their Time Idle
During LLM inference, over 50% of attention kernel cycles are stalled waiting for memory access. No matter how fast a GPU computes, if memory is slow, half the time is wasted waiting. HBM is the technology that dramatically reduces this idle time.
- In 1983, IBM released a personal computer called the XT.
- The XT ran on Intel’s 8088 CPU with RAM and storage, but running programs required manually swapping floppy disks.
- A single floppy disk’s capacity was so limited that running a word processor meant swapping several disks back and forth.
- In 1984, the AT arrived with a built-in hard drive, diminishing the floppy disk’s importance.
- CPUs kept getting faster — from XT to AT, 386, 486, and Pentium — and when Windows 3.0 launched in 1990, the graphics era truly began.
- When 3D games like DOOM pushed CPUs to their limits, NVIDIA’s GPU emerged to handle dedicated graphics processing.
- GPUs were born for gaming, but their ability to process calculations in parallel expanded their use cases far beyond.
- The CPU+GPU+DRAM architecture emerged, but DRAM’s problem was speed.
- No matter how fast the CPU and GPU could work, slow DRAM meant growing wait times.
- Imagine ordering food on a delivery app — the kitchen is fast, but the delivery driver is slow.
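The "fast kitchen, slow driver" problem above can be made concrete with a back-of-envelope roofline check. The numbers below are illustrative assumptions (roughly a current flagship datacenter GPU), not figures from this article:

```python
# Back-of-envelope roofline check: is single-token LLM decode
# compute-bound or memory-bound? All numbers are illustrative
# assumptions (roughly a current flagship datacenter GPU).
peak_flops = 1.0e15   # ~1 PFLOP/s dense FP16 throughput (assumed)
mem_bw = 3.35e12      # ~3.35 TB/s HBM bandwidth (assumed)

# During batch-1 decode, each FP16 weight (2 bytes) is read once and
# used for ~2 FLOPs, so arithmetic intensity is about 1 FLOP/byte.
intensity = 1.0

# The "ridge point" where compute time equals memory-transfer time:
ridge = peak_flops / mem_bw
print(f"ridge point: {ridge:.0f} FLOPs/byte")
print("memory-bound" if intensity < ridge else "compute-bound")
```

With an intensity of ~1 FLOP/byte against a ridge point of roughly 300, decode sits deep in memory-bound territory, which is consistent with the stalled-cycle statistics quoted above.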

The Birth of HBM and the HBM4 Era
- Enter HBM (High Bandwidth Memory) — memory that sits right next to the GPU and feeds it data at blazing speeds.
- HBM works by stacking memory chips in layers, positioned directly adjacent to the GPU to dramatically increase data transfer speeds.
- In February 2026, Samsung Electronics and SK hynix simultaneously began mass production of HBM4, the 6th generation HBM (Global Economic).
- This was an aggressive move, pushed 3–4 months ahead of the originally targeted mid-2026 timeline.
- HBM4’s per-pin operating speed is 11.7Gbps, 46% faster than the JEDEC international standard of 8Gbps (The Elec).
- A single HBM4 unit costs approximately $700 (Aju Economy).
- While 20–30% more expensive than HBM3E, demand is surging as HBM is essential for AI data centers.
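The quoted pin speed translates into per-stack bandwidth once you factor in the interface width. The 2048-bit width below is an assumption based on the JEDEC HBM4 interface, not a figure stated in this article:

```python
# Per-stack bandwidth implied by the quoted 11.7 Gbps pin speed,
# assuming the JEDEC HBM4 interface width of 2048 bits per stack.
pin_speed_gbps = 11.7    # per-pin data rate from the article
bus_width_bits = 2048    # assumed interface width per HBM4 stack

bandwidth_gbs = pin_speed_gbps * bus_width_bits / 8  # bits -> bytes
print(f"{bandwidth_gbs:.0f} GB/s (~{bandwidth_gbs / 1000:.1f} TB/s) per stack")
```

Under these assumptions, a single HBM4 stack moves close to 3 TB/s, which is why a handful of stacks next to the GPU can feed it far faster than any off-package memory.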
SK hynix vs. Samsung — The Market Share War
HBM Market Share (Q3 2025)
- SK hynix: 57%
- Samsung: 22%
- Micron: 21%
Source: Astute Group (Q3 2025)
- The market share numbers tell a clear story: SK hynix is dominant.
- As of Q3 2025, SK hynix holds 57% of the HBM market, while Samsung sits at 22% (Astute Group).
- SK hynix is reportedly set to supply approximately 70% of HBM4 units for NVIDIA’s next-generation GPU platform, Rubin (TrendForce).
- Samsung has set a target to boost its share to the 30% range in 2026 and is aggressively closing the gap.
- SK hynix surpassing Samsung in annual profits for the first time in 2025 was a symbolic event showcasing the tectonic shift HBM has brought to the industry (CNBC).
The Rise of HBF — Solving the Next Bottleneck
HBM4 vs. HBF Comparison
HBM4 (Current Generation)
- DRAM stacking technology
- Capacity: 24–48 GB
- Ultra-fast working memory next to GPU
- Mass production began February 2026
- ~$700 per unit
HBF (Next Generation)
- NAND flash stacking technology
- 8–16x capacity vs. DRAM
- High-capacity, high-speed supplementary storage
- Target mass production: early 2027
- 2.69x power efficiency improvement
- Even with the CPU-GPU-HBM architecture, a bottleneck remains.
- HBM delivers data quickly on demand, but it is not suited for storing large volumes of data long-term.
- Finished computations and archival data are stored on NAND flash, but NAND sits far from the GPU-HBM pair and operates much slower — creating a significant time sink.
- This is where HBF (High Bandwidth Flash) enters the picture.
- The concept: stack NAND like HBM to increase capacity, then place it right next to HBM to boost speed.
HBM4 vs. HBF Detailed Specs
- HBM4 operating speed: 11.7Gbps
- HBF power efficiency improvement: 2.69x
- HBM4 unit price: ~$700
- HBF capacity multiplier: up to 16x
- HBF simulations show 8–16x higher capacity than DRAM with 2.69x better performance-per-watt efficiency (TrendForce).
- SK hynix unveiled a hybrid architecture called “H3” that integrates both HBM and HBF into a single design.
- Simulations combining NVIDIA’s latest Blackwell GPU (B200) with 8 stacks of HBM3E and 8 stacks of HBF showed significant LLM inference performance improvements (Blocks and Files).
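The article does not state HBF's per-stack capacity directly, but a rough range follows from multiplying the figures it does quote (24–48 GB per HBM4 stack, 8–16x capacity vs. DRAM). This is illustration only, not a spec:

```python
# Implied per-stack HBF capacity, combining the article's figures:
# HBM4 stacks hold 24-48 GB, and HBF is quoted at 8-16x DRAM capacity.
hbm4_gb = (24, 48)        # HBM4 capacity range per stack
hbf_multiplier = (8, 16)  # quoted capacity multiplier vs. DRAM

low = hbm4_gb[0] * hbf_multiplier[0]
high = hbm4_gb[1] * hbf_multiplier[1]
print(f"implied HBF stack capacity: {low}-{high} GB")
```

Hundreds of gigabytes per stack is the scale that would let an H3-style hybrid hold entire large-model weights and KV caches close to the GPU instead of on distant NAND.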
Corporate HBF Strategies — Collaboration vs. Going Solo
- SK hynix and Samsung have taken sharply divergent approaches to HBF development.
- SK hynix is pursuing co-development + standardization with SanDisk, which has NAND expertise, targeting sample shipments in H2 2026 and mass production in early 2027 (SanDisk Official).
- Samsung has chosen the solo development path.
- Going forward, companies with HBM stacking expertise, such as the femtosecond laser grooving technology used to dice ultra-thin HBM4 wafers, will hold a structural advantage in HBF as well.
- HBM4 wafer thickness has decreased to 20–30 micrometers, hitting the limits of conventional cutting methods. SK hynix has announced the adoption of femtosecond (one-quadrillionth of a second) laser technology (ET News).
Market Outlook and Risks
- The total HBM market in 2026 is projected to reach $54.6 billion (approximately KRW 75 trillion), growing 58% year-over-year (BofA, SK hynix Newsroom).
- Samsung and SK hynix are each reportedly targeting quarterly operating profit of KRW 30 trillion (Aju Economy). This was also discussed in our AI Infrastructure 12-Company Comparative Analysis.
- However, the U.S.-China semiconductor war presents a significant risk.
- When the U.S. banned HBM exports to China in December 2024, China retaliated by banning gallium and germanium exports to the U.S.
- With China controlling 98% of global gallium production, a worst-case scenario in which China bans exports to the U.S. of any product made with Chinese rare earth materials cannot be ruled out.
- South Korea’s strategic mineral reserves average just 56.8 days’ worth (Ministry of Trade, Industry and Energy). Japan, by comparison, stockpiles up to 180 days’ worth of high-risk minerals (IEA). That is more than a threefold gap.
- To fully capture the HBM supercycle’s benefits, supply chain risk preparedness must go hand in hand.
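Two of the figures above can be sanity-checked against each other with simple arithmetic (no new data, just the numbers already quoted):

```python
# Sanity-check arithmetic on two figures quoted above: the 2025 HBM
# market size implied by the 2026 projection, and the Japan-vs-Korea
# strategic mineral reserve gap.
market_2026 = 54.6   # projected 2026 HBM market, $B (BofA)
yoy = 0.58           # projected +58% year-over-year growth

implied_2025 = market_2026 / (1 + yoy)
print(f"implied 2025 market: ${implied_2025:.1f}B")

korea_days, japan_days = 56.8, 180
print(f"reserve gap: {japan_days / korea_days:.2f}x")
```

The implied 2025 base of roughly $35 billion and the ~3.2x reserve gap both line up with the "more than threefold" framing in the text.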
Supply Chain Risk Checkpoint
China controls 98% of global gallium production, and the U.S.-China semiconductor war is expanding into rare earth export controls. South Korea’s strategic mineral reserves (56.8 days) are one-third of Japan’s (up to 180 days). To fully benefit from the HBM supercycle, raw material supply chain diversification must come first.
Bottom Line. HBM was the answer to GPUs spending over half their time idle, and HBF is likely the next evolutionary step. In a $54.6 billion market, the competitive battleground between SK hynix and Samsung is not about making chips faster — it is about minimizing how long the GPU has to wait.
Professional Takeaway. The fact that the critical bottleneck in AI infrastructure is memory, not GPUs, connects to three actionable insights. First, for semiconductor industry professionals, packaging and stacking expertise may carry greater career value than design or process skills going forward. Second, for investors evaluating AI beneficiaries, the focus should expand beyond GPU makers to include memory companies’ HBM-to-HBF transition roadmaps. Third, for IT infrastructure planners, understanding how memory bandwidth affects total cost of ownership (TCO) in data center design is immediately applicable knowledge.
Disclaimer: This analysis is for educational and informational purposes only and does not constitute investment advice. Investment decisions should be made at the reader’s own discretion and responsibility.
Related Articles
- The Real Bottleneck in the AI Infrastructure War: Power and Semiconductors in the $1 Trillion CapEx Era — Analysis of power and semiconductor bottlenecks in AI infrastructure investment
- AI Infrastructure Peer Comparison: 12 Companies, Part 1 — Peer Groups and Multiples — From NVIDIA to Hanmi Semiconductor
- AI Infrastructure Peer Comparison: 12 Companies, Part 3 — Sector Trends and Investment — AI infrastructure sector investment strategy
References
- arXiv:2503.08311 — Analysis of memory access latency in LLM inference attention kernels
- Global Economic — Samsung and SK hynix begin simultaneous HBM4 mass production
- The Elec — HBM4 operating speed 11.7Gbps, 46% above JEDEC standard
- Aju Economy — HBM4 unit price ~$700, quarterly operating profit KRW 30T target
- Astute Group — Q3 2025 HBM market share
- TrendForce — SK hynix to supply ~70% of NVIDIA Rubin HBM4
- CNBC — SK hynix surpasses Samsung in annual profit for the first time
- TrendForce — HBF power efficiency 2.69x improvement simulation
- Blocks and Files — SK hynix H3 hybrid architecture
- SanDisk — SK hynix and SanDisk HBF co-development and standardization
- ET News — HBM4 femtosecond laser grooving technology adoption
- SK hynix Newsroom / BofA — 2026 HBM market projection: $54.6 billion
- Ministry of Trade, Industry and Energy — South Korea strategic mineral reserves: 56.8 days average
- IEA — Japan high-risk mineral stockpile: up to 180 days
Frequently Asked Questions (FAQ)
Q1. What is HBM and why does it matter for AI?
HBM (High Bandwidth Memory) is memory that sits directly adjacent to GPUs, dramatically increasing data transfer speeds. During LLM inference, over 50% of GPU cycles are stalled waiting for data — HBM was designed to solve this bottleneck.
Q2. How do SK hynix and Samsung differ in their HBF strategies?
SK hynix is pursuing co-development with SanDisk and pushing for industry standardization, targeting early 2027 mass production. Samsung is developing HBF independently. The divergent approaches reflect different bets on ecosystem building versus proprietary control.
Q3. What are the key market risks?
The 2026 HBM market is projected at $54.6 billion (+58% YoY), but the U.S.-China semiconductor war and rare earth export controls pose significant supply chain risks. China controls 98% of global gallium production, and South Korea’s strategic mineral reserves (56.8 days) are significantly below Japan’s (up to 180 days).