Samsung Announces 'Shinebolt' HBM3E Memory: HBM Hits 36GB Stacks at 9.8 Gbps


Samsung’s annual Memory Tech Day is taking place in San Jose this morning, and as part of the event, the company is making a couple of notable memory technology announcements/disclosures. The highlight of Samsung’s event is the introduction of Shinebolt, Samsung’s HBM3E memory that will set new marks for both memory bandwidth and memory capacity for high-end processors. The company is also disclosing a bit more on their GDDR7 memory, which will mark a significant technological update to the GDDR family of memory standards.

Starting things off, we have today’s marquee announcement: Shinebolt HBM3E memory. Like the rest of the memory industry, Samsung is preparing a successor to the current generation of HBM3 memory that’s being used with high-end/HPC-grade processors, with the industry settling around the upcoming HBM3E standard. HBM3E is slated to offer both significantly higher capacities and greater memory bandwidth than HBM3, helping the high-bandwidth memory technology keep up with ever-growing workloads on high-end processors.

Samsung HBM Memory Generations

                                HBM3E         HBM3         HBM2E        HBM2
                                (Shinebolt)   (Icebolt)
Max Capacity                    36 GB         24 GB        16 GB        8 GB
Max Bandwidth Per Pin           9.8 Gb/s      6.4 Gb/s     3.6 Gb/s     2.0 Gb/s
Number of DRAM ICs per Stack    12            12           8            8
Effective Bus Width             1024-bit      1024-bit     1024-bit     1024-bit
Voltage                         ?             1.1 V        1.2 V        1.2 V
Bandwidth per Stack             1.225 TB/s    819.2 GB/s   460.8 GB/s   256 GB/s

The basis of Shinebolt is a new 24Gbit HBM memory die, which Samsung will be producing on their D1a process, the company’s EUV-based 4th generation 10nm-class (14nm) node. Samsung will be producing both 8-Hi and eventually 12-Hi stacks based on this new die, allowing for total stack capacities of 24GB and 36GB respectively, 50% more capacity than their HBM3 (Icebolt) equivalents.

According to Samsung, Shinebolt will be able to hit memory clockspeeds as high as 9.8Gbps/pin, better than 50% faster than their HBM3 products. Though given some of Samsung’s previous memory clockspeed claims, there’s a good chance this is a semi-overclocked state. Shinebolt development isn’t far enough along for Samsung to list individual SKUs, but even at the conservative end of things, Samsung is promoting data rates of at least 8Gbps/pin in their event press release. And if Samsung’s ambitious memory frequencies do come to fruition, that would put Samsung ahead of their competition as well; to date, SK hynix and Micron have announced plans for 9Gbps/pin and 9.2Gbps/pin memory respectively, so Samsung’s claims are certainly the most aggressive.

Overall, these clockspeeds would give a single HBM3E stack a minimum bandwidth of 1TB/sec, and a maximum bandwidth of 1.225TB/sec, well ahead of the 819GB/sec data rate of HBM3. Or to frame things in terms of a high-end processor (e.g. NVIDIA H100), a 6-stack chip would be able to access as much as 216GB of memory with an aggregate memory bandwidth as high as 7.35TB/sec.
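The per-stack and aggregate figures above fall out of straightforward arithmetic: pin rate times the 1024-bit stack interface, divided by 8 bits per byte. A rough sketch (the function name is ours, and note that the quoted TB/s figures use a 1024-based conversion):

```python
# Back-of-the-envelope HBM bandwidth math from the figures above.
def stack_bandwidth_gbs(pin_rate_gbps: float, bus_width_bits: int = 1024) -> float:
    """Per-stack bandwidth in GB/s: pin rate x bus width / 8 bits per byte."""
    return pin_rate_gbps * bus_width_bits / 8

hbm3e = stack_bandwidth_gbs(9.8)   # 1254.4 GB/s
hbm3 = stack_bandwidth_gbs(6.4)    # 819.2 GB/s

# Vendors quote TB/s here using binary (divide-by-1024) conversion.
print(f"HBM3E per stack: {hbm3e:.1f} GB/s ({hbm3e / 1024:.3f} TB/s)")
print(f"6-stack aggregate: {6 * hbm3e / 1024:.2f} TB/s across {6 * 36} GB")
```

Running this reproduces the article’s 1.225 TB/s per stack and 7.35 TB/s / 216 GB for a hypothetical 6-stack processor.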

As for power efficiency, things look to be a bit of a mixed bag. On a relative basis, Samsung says that Shinebolt will be 10% more efficient than Icebolt – in other words, consuming 10% less power per bit transferred (pJ/bit). However, a 25%+ clockspeed improvement will more than wipe out those gains due to the significant increase in bits transferred. So while Shinebolt will be more efficient overall, on an absolute basis it seems that total power consumption for HBM memory will continue to grow with the next generation.
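To make the trade-off concrete: if power scales as energy-per-bit times bits-per-second (a simplification we are assuming here), a 10% pJ/bit gain cannot offset a 25%+ data-rate increase. A quick sanity check, using the conservative 8Gbps and peak 9.8Gbps figures against HBM3’s 6.4Gbps:

```python
# Why a 10% pJ/bit improvement still means higher absolute power draw:
# assume power ~ (energy per bit) x (bits per second).
def relative_power(efficiency_gain: float, old_rate: float, new_rate: float) -> float:
    """Power of the new part relative to the old (1.0 = unchanged)."""
    return (1 - efficiency_gain) * (new_rate / old_rate)

print(relative_power(0.10, 6.4, 8.0))   # conservative 8 Gbps: 1.125 (+12.5%)
print(relative_power(0.10, 6.4, 9.8))   # peak 9.8 Gbps: ~1.38 (+38%)
```

Either way the ratio comes out above 1.0, matching Samsung’s implication that absolute HBM power keeps climbing.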

Either way, for the high-end processor market that Samsung is targeting with Shinebolt, chipmakers are unlikely to be fazed by the power increase. Like the rest of the high-end processor space, Samsung has the AI market set square in its sights – a market segment where both memory bandwidth and memory capacity are limiting factors, especially with massive large language models (LLMs). Along with the traditional supercomputer and networking market segments, Samsung should have little trouble selling faster HBM in the middle of a booming AI market.

Like the other major memory vendors, Samsung expects to ship Shinebolt at some point in 2024. Given that the company just started sampling the memory – and that HBM3 Icebolt itself just hit mass production – Shinebolt is likely not shipping until the later part of the year.

A Brief Teaser connected HBM4: FinFETs & Copper-to-Copper Bonding

Finally, looking even farther into the future, Samsung is briefly talking about their plans for HBM4 memory. While that technology is still a few years off (there’s not even an approved specification for it yet), we know from previous disclosures that the memory industry is aiming to move to a wider, 2048-bit memory interface. Which, as Samsung likes to frame things, is the only practical choice when further HBM clockspeed increases would blow out power consumption.

For HBM4, Samsung is looking at employing more advanced fab and packaging technologies that are currently the domain of logic chips. On the fab side of matters, the company wants to move to using FinFET transistors for their memory, as opposed to the planar transistors still used there. As with logic, FinFETs would reduce the drive current required, which would help to improve DRAM power efficiency. Meanwhile, on the packaging side of matters, Samsung is looking at moving from micro-bump bonding to bumpless (direct copper-to-copper) bonding, a packaging technique that’s still on the cutting edge of development even in the logic space. Embracing cutting-edge technologies will be critical to keeping HBM bandwidth growing as it has over the past decade, but the costs and complexities of doing so also underscore why HBM remains an exclusively niche, high-end memory technology.

GDDR7 Update: 50% Lower Stand-By Power Than GDDR6

Besides HBM3E, Samsung’s other big bandwidth memory update of the day is a brief status update on their GDDR7 memory.

Back in July of this year, Samsung announced that they had completed initial development on their GDDR7 memory. The next generation of GDDR memory, GDDR7 brings with it several major changes versus today’s GDDR6, the most significant of which is a move to PAM3 encoding. PAM3 allows for 1.5 bits to be transferred per cycle (or rather, 3 bits over 2 cycles), opening the door to improving memory transfer rates without employing more costly means of further improving the frequency of the memory bus.
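The "1.5 bits per cycle" figure comes from counting signal levels: an n-level (PAM-n) symbol carries log2(n) bits, and two 3-level PAM3 symbols give 3 × 3 = 9 states, enough to encode 3 full bits (2³ = 8 states). A small illustration of that counting argument:

```python
import math

# Bits per symbol for each signaling scheme mentioned in the article.
for levels, name in [(2, "NRZ"), (3, "PAM3"), (4, "PAM4")]:
    print(f"{name}: {math.log2(levels):.2f} bits/cycle (theoretical)")

# Two PAM3 symbols yield 3^2 = 9 states, enough for 3 bits (2^3 = 8),
# which is where GDDR7's effective 1.5 bits/cycle comes from.
assert 3 ** 2 >= 2 ** 3
```

In practice only 8 of the 9 states are used for data, so the effective rate is exactly 3 bits per 2 cycles rather than the theoretical log2(3) ≈ 1.58 bits per cycle.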

GDDR Memory Generations

                           GDDR7                 GDDR6X               GDDR6
B/W Per Pin                32 Gbps (Projected)   24 Gbps (Shipping)   24 Gbps (Sampling)
Chip Density               2 GB (16 Gb)          2 GB (16 Gb)         2 GB (16 Gb)
Total B/W (256-bit bus)    1024 GB/sec           768 GB/sec           768 GB/sec
DRAM Voltage               1.2 V                 1.35 V               1.35 V
Signaling                  PAM-3                 PAM-4                NRZ (Binary)
Packaging                  266 FBGA              180 FBGA             180 FBGA

As a quick recap from Samsung’s July announcement, Samsung will be rolling out 16Gbit (2GB) modules, which will be able to run at up to 32Gbps/pin. That’s a 33% improvement in bandwidth per pin over current GDDR6 memory, and would bring the aggregate bandwidth of a 256-bit memory bus to a cool 1TB/second. GDDR7 should also deliver a 20% improvement in power efficiency over Samsung’s GDDR6 (in terms of pJ/bit), thanks in part to the use of Samsung’s 3rd generation D1z (10nm-class) fab node.
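As with HBM, the headline GDDR7 numbers are simple products of pin rate and bus width; a quick check of the 1TB/s and 33% claims (function name is ours):

```python
# GDDR bandwidth check: per-pin rate x bus width / 8 bits per byte.
def bus_bandwidth_gbs(pin_rate_gbps: float, bus_width_bits: int) -> float:
    return pin_rate_gbps * bus_width_bits / 8

gddr7 = bus_bandwidth_gbs(32, 256)   # 1024 GB/s on a 256-bit card
gddr6 = bus_bandwidth_gbs(24, 256)   # 768 GB/s
print(gddr7, gddr6, f"{gddr7 / gddr6 - 1:.0%} faster")
```

32Gbps on a 256-bit bus works out to 1024 GB/s, a one-third improvement over 24Gbps GDDR6’s 768 GB/s.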

Today’s event from Samsung is mostly a recap of July’s announcement, but in the process we have learned a couple of new technical specifications on GDDR7 that Samsung hadn’t previously disclosed. First off, GDDR7 isn’t just improving active power consumption; the tech will also improve on stand-by power consumption to a significant degree. Thanks to additional clock controls, GDDR7 will consume 50% less stand-by power than GDDR6.

Second, in discussing why Samsung (and the industry as a whole) went with PAM3 encoding for GDDR7 instead of even denser PAM4, the company confirmed some of our technical suppositions on the new technology. In short, PAM3 has a lower average bit error rate (BER) than PAM4, largely thanks to the wider margins on the eye window. None of which makes PAM4 unworkable (as Micron has already proven), but Samsung and the rest of the memory industry are favoring the relative simplicity of PAM3, given the trade-offs.

Besides the usual video card/gaming customers, Samsung is expecting GDDR7 to be adopted by AI chip makers, and perhaps a bit more surprisingly, the automotive industry. In fact, some of these non-traditional customers may be the first to adopt the memory; since the traditional GPU vendors are still mid-cycle on their current generation of products, it will still be quite some time before they ship any GDDR7-capable silicon.

At this point Samsung has not announced a projected date for when their GDDR7 memory will go into mass production. But the company is still expecting to be the first vendor to ship the next-generation memory, presumably in 2024.
