Researchers Weigh New Benchmarks For Green500 Amid Shifting Workload Priorities

SC23 Is it time for the Green500 to expand its scope to account for more diverse workloads? This was one of the questions attendees grappled with at SC23.

Similar to the Top500, which ranks systems based on sheer performance, the Green500 weighs that performance against a system's power consumption in terms of gigaFLOPS per watt.
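
As a rough illustration, the efficiency figure is sustained benchmark performance divided by average power draw. The sketch below is a minimal example only: the function name is hypothetical, and the sample inputs are the rounded two petaFLOPS and 31 kilowatts quoted for Henri later in this article, not values taken from the official Green500 run rules.

```python
def gflops_per_watt(rmax_gflops: float, power_watts: float) -> float:
    """Return energy efficiency in gigaFLOPS per watt.

    Hypothetical helper for illustration; the actual Green500 rules
    also govern how and where power is measured.
    """
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return rmax_gflops / power_watts


# Rounded inputs: 2 petaFLOPS = 2,000,000 gigaFLOPS, 31 kW = 31,000 W
henri_efficiency = gflops_per_watt(2_000_000, 31_000)
print(f"{henri_efficiency:.1f} gigaFLOPS per watt")  # prints 64.5 gigaFLOPS per watt
```

With these rounded inputs the result lands at about 64.5, in line with the 65 gigaFLOPS per watt reported for the system.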

High Performance Linpack has been the gold standard for testing compute clusters for performance in exacting double-precision workloads. So, when the Green500 was launched in 2007 it made sense to use Linpack as the basis for evaluating the efficiency of these systems.

The problem is that Linpack is only one benchmark, and it isn't representative of all workloads. This is why we've seen benchmarks like High Performance Conjugate Gradient (HPCG) and HPL-MxP (formerly HPL-AI) crop up over the years to provide additional context for both traditional double-precision and mixed-precision workloads.

But while we've found new ways to benchmark supercomputers in terms of performance, the Green500 remains tied to Linpack. That, however, may not be the case for much longer.

Trying an alternative approach

Over the past 18-24 months there has been a growing movement to broaden the scope of the Green500 to alternative workloads, Wu-chun Feng explained during a presentation at SC23.

"Mike Heroux and Jack Dongarra in particular have broached this subject about looking at the Green500 using HPCG," he said. "Satoshi Matsuoka has been talking about 'all these benchmarks are important can we come up with some type of composite Green500 number of FLOPS per watt by somehow combining the numbers we get from the different benchmarks'."
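
The presentation, as reported here, does not spell out how such a composite would be calculated. Purely as a sketch of one possibility, the snippet below takes the geometric mean of per-benchmark efficiency figures, a common way to combine ratios so that no single benchmark dominates. The function name, the choice of geometric mean, the equal weighting, and the sample values are all assumptions, not a proposed Green500 method.

```python
import math


def composite_efficiency(gflops_per_watt_by_benchmark: dict[str, float]) -> float:
    """Geometric mean of per-benchmark gigaFLOPS-per-watt scores.

    Assumption: every benchmark is weighted equally. Nothing in the
    Green500 discussion prescribes this formula.
    """
    scores = list(gflops_per_watt_by_benchmark.values())
    if not scores or any(s <= 0 for s in scores):
        raise ValueError("need at least one score, all positive")
    # Average in log space, then exponentiate
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


# Made-up scores for a hypothetical system, not measured results
example = {"HPL": 50.0, "HPCG": 0.5, "HPL-MxP": 200.0}
print(f"{composite_efficiency(example):.2f} gigaFLOPS per watt")
```

A weighted mean, or a ranking based on per-benchmark placement, would be equally plausible designs; the geometric mean is used here only because it is simple to reason about.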

Some of the early testing has involved HPCG, which is designed to more closely reflect real-world performance in a wide variety of HPC workloads. If you take a look at HPCG performance, the scores are substantially lower than you'd expect to see from Linpack. In fact, Japan's Fugaku comes in fourth in Linpack but first in the HPCG benchmark, beating out Frontier.

It's important to remember that for the Green500, the workload, whether it's Linpack or HPCG, is there just as much to measure power consumption as it is to measure performance. Different benchmarks are going to utilize infrastructure like accelerators and network fabrics to different degrees. As such, testing methodologies may need to be adjusted to accommodate alternative workloads.

While the task is complex, Feng noted that "HPCG presents an opportunity to innovate from a software perspective in order to deliver energy efficiency."

While Feng didn't touch on HPL-MxP in much detail, there also appears to be an opportunity to address workloads that can take advantage of lower-precision floating point calculations to achieve a speedup compared to your typical FP64 application.

Looking at modern accelerators, it's not hard to see why. Nvidia's H100, for instance, sports up to 67 teraFLOPS of FP64, but drop down to FP8 and you're looking at 2 petaFLOPS, and roughly 4 petaFLOPS with sparsity enabled.
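
To put those figures in perspective, a quick back-of-the-envelope calculation using the peak numbers quoted above gives the theoretical speedup from dropping precision. These are rounded peak rates, the variable names are illustrative, and real applications may see considerably less.

```python
# Peak throughput figures quoted above, in teraFLOPS
fp64_tflops = 67
fp8_dense_tflops = 2_000    # roughly 2 petaFLOPS
fp8_sparse_tflops = 4_000   # roughly 4 petaFLOPS with sparsity

dense_speedup = fp8_dense_tflops / fp64_tflops
sparse_speedup = fp8_sparse_tflops / fp64_tflops

print(f"FP8 vs FP64: about {dense_speedup:.0f}x")                 # about 30x
print(f"FP8 with sparsity vs FP64: about {sparse_speedup:.0f}x")  # about 60x
```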

Scientists at the University of Bristol have demonstrated the advantages of running climate models at half precision. But the biggest beneficiary of lower precision is undoubtedly AI training and inference, especially for models that take advantage of sparsity.

As such, it's not hard to imagine a system that's incredibly efficient in mixed-precision workloads but performs rather poorly in HPC benchmarks. But just like HPCG, incorporating HPL-MxP into the Green500 ranking will likely require new testing methodology.

Henri maintains its lead on the Green500

Despite the excitement surrounding Aurora's arrival on the Top500 ranking of supercomputers, there weren't nearly as many surprises with regard to this fall's Green500.

The Flatiron Institute's two-petaFLOPS Henri system retained its top spot. The 31-kilowatt Lenovo ThinkSystem cluster managed to squeeze 65 gigaFLOPS per watt from its 5,920 Nvidia H100 and Ice Lake Xeon cores.

That said, two systems have moved into the top 10 most efficient supers. These included EuroHPC's MareNostrum 5 ACC which, in addition to claiming the number eight spot on the Top500, managed to displace Frontier for sixth place on the Green500.

Built by Eviden, the system features an arrangement similar to Henri's, pairing Nvidia's H100s with Intel's newer 4th-Gen Xeon Scalable processors. The system managed to achieve 54 gigaFLOPS per watt of efficiency in the test.

South Korea's Olaf system was the other new system to break into the upper echelon of the Green500, claiming the number ten spot at 45 gigaFLOPS per watt.

Olaf is another Lenovo ThinkSystem machine, but instead of Intel's CPUs it pairs Nvidia H100 GPUs with AMD's 32-core Epyc Genoa processors. ®
