Oracle Boasts Zettascale 'AI Supercomputer,' Just Don't Ask About Precision
Comment Oracle says it's already taking orders on a 2.4 zettaFLOPS cluster with "three times as many GPUs as the Frontier supercomputer."
But let's get precise about precision: Oracle hasn't actually managed a 2,000x performance boost over the United States' top-ranked supercomputer — those are "AI zettaFLOPS" and the tiniest, sparsest ones Nvidia's chips can muster versus the standard 64-bit way of measuring super performance.
This might be common knowledge for most of our readers, but it seems Oracle needs a reminder: FLOPS are pretty much meaningless unless they're accompanied by a unit of measurement. In scientific computing, FLOPS are usually measured at double precision or 64-bit.
However, for the kind of fuzzy math that makes up modern AI algorithms, we can usually get away with far lower precision, down to 16-bit, 8-bit, or, in the case of Oracle's new Blackwell cluster, 4-bit floating point.
The 131,072 Blackwell accelerators that make up Oracle's "AI supercomputer" are in fact capable of churning out 2.4 zettaFLOPS of sparse FP4 compute, but the claim is more marketing than reality.
That's because most models today are still trained at 16-bit floating point or brain float precision. But at that precision, Oracle's cluster only manages about 459 exaFLOPS of AI compute, which just doesn't have the same ring to it as "zettascale".
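For the napkin-math inclined, here's a minimal sketch of how the same 131,072-GPU cluster produces such wildly different headline figures depending on precision and sparsity. The per-GPU numbers are simply inferred by dividing the aggregate figures quoted above, not measured results:

```python
# Back-of-the-envelope: the same GPU count, very different headline FLOPS.
# Per-GPU figures are assumptions inferred from the aggregates in the article.

GPUS = 131_072

# Implied per-GPU throughput if the cluster really hits 2.4 zettaFLOPS of sparse FP4.
sparse_fp4_per_gpu = 2.4e21 / GPUS          # ~18.3 petaFLOPS per GPU

# The ~459 exaFLOPS figure at 16-bit precision implies roughly:
bf16_per_gpu = 459e18 / GPUS                # ~3.5 petaFLOPS per GPU

print(f"Implied sparse FP4 per GPU : {sparse_fp4_per_gpu / 1e15:.1f} PFLOPS")
print(f"Implied BF16 per GPU       : {bf16_per_gpu / 1e15:.1f} PFLOPS")
print(f"Headline inflation factor  : {2.4e21 / 459e18:.1f}x")
```

Run it and the "zettascale" claim turns out to rest on roughly a 5x multiplier from dropping precision and assuming sparsity.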
Note that there's nothing technically stopping you from training models at FP8 or even FP4, but doing so comes at the cost of accuracy. Instead, these lower precisions are more commonly used to speed up the inferencing of quantized models, a scenario where you'll pretty much never need all 131,072 chips even when serving up a multi-trillion parameter model.
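To see why dropping to 4 bits is usually an inference trick rather than a training recipe, here's a toy sketch that quantizes a set of weights to a simple symmetric integer grid and measures the error. It simulates generic uniform quantization, not Nvidia's actual FP4 or FP8 formats, purely to illustrate the accuracy trade-off:

```python
# Toy demo: 4-bit quantization throws away far more information than 8-bit.
# Simulates plain symmetric uniform quantization, not Nvidia's FP4/FP8 formats.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0, 0.02, size=100_000).astype(np.float32)

def quantize(x, bits):
    # Scale values onto a symmetric integer grid, round, and scale back.
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / levels
    return np.round(x / scale) * scale

for bits in (8, 4):
    err = np.abs(weights - quantize(weights, bits)).mean()
    print(f"{bits}-bit mean absolute error: {err:.2e}")
```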
- Oracle to power 1GW datacenter with trio of tiny nuclear reactors
- SambaNova makes Llama gallop in inference cloud debut
- We're in the brute force phase of AI – once it ends, demand for GPUs will too
- DoE drops $23M in effort to reinvigorate supercomputing
What's funny is that if Oracle can actually network all those GPUs together with RoCEv2 or InfiniBand, we're still talking about a pretty beefy HPC cluster. At FP64, the peak performance of Oracle's Blackwell Supercluster is between 5.2 and 5.9 exaFLOPS, depending on whether we're talking about the B200 or GB200. That's more than three times the peak performance of AMD's Frontier system.
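A rough comparison, using the peak figures above and assuming Frontier's published FP64 peak of roughly 1.7 exaFLOPS (theoretical peaks in both cases, not sustained HPL results):

```python
# Rough FP64 comparison against Frontier, using the article's peak figures.
# Frontier's ~1.7 exaFLOPS FP64 peak is an assumption based on its published Rpeak.

blackwell_fp64_peak = {"B200 cluster": 5.2e18, "GB200 cluster": 5.9e18}
frontier_fp64_peak = 1.7e18

for name, flops in blackwell_fp64_peak.items():
    print(f"{name}: {flops / 1e18:.1f} EFLOPS peak, "
          f"{flops / frontier_fp64_peak:.1f}x Frontier's peak")
```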
Interconnect overheads being what they are, we'll note that even getting close to peak performance at these scales is next to impossible.
Oracle already offers H100 and H200 superclusters capable of scaling to 16,384 and 65,536 GPUs, respectively. Blackwell-based superclusters, including Nvidia's flagship GB200 NVL72 rack-scale systems, will be available beginning in the first half of 2025. ®