NVIDIA Achieves Record Performance in Latest MLPerf Training Benchmarks

The full-stack NVIDIA accelerated computing platform has once again demonstrated exceptional performance in the latest MLPerf Training v4.0 benchmarks, according to the NVIDIA Blog.

Unprecedented Performance in Large Language Models

NVIDIA more than tripled its performance on the large language model (LLM) benchmark, based on GPT-3 175B, compared to its previous record-setting submission. This feat was achieved using an AI supercomputer featuring 11,616 NVIDIA H100 Tensor Core GPUs connected with NVIDIA Quantum-2 InfiniBand networking, a significant increase from the 3,584 H100 GPUs used last year. This scalability showcases the extensive full-stack engineering efforts by NVIDIA.

The scalability of the NVIDIA AI platform enables faster training of massive AI models like GPT-3 175B, translating into significant business opportunities. For instance, NVIDIA's recent earnings call highlighted that LLM service providers could potentially turn a single dollar invested into seven dollars over four years by running the Llama 3 70B model on NVIDIA HGX H200 servers.

NVIDIA H200 GPU: Pushing Boundaries

The NVIDIA H200 Tensor Core GPU, built on the Hopper architecture, offers 141GB of HBM3e memory and over 40% more memory bandwidth than the H100 GPU. In its MLPerf Training debut, the H200 delivered up to 47% higher performance than the H100, pushing the boundaries of AI training capabilities.

Software Optimizations Drive Performance Gains

NVIDIA also reported a 27% performance boost with its 512-GPU H100 configuration compared to the previous year's submission, thanks to numerous software stack optimizations. This improvement underscores the impact of continuous software enhancements on performance, even on existing hardware.

The submission highlighted nearly perfect scaling, with performance increasing proportionally as the number of GPUs rose from 3,584 to 11,616.
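As a rough back-of-the-envelope check, scaling efficiency can be estimated as the measured performance gain divided by the increase in GPU count. The GPU counts below come from the article; the 3.2x performance ratio is an illustrative assumption standing in for the article's "more than tripled" figure, not a reported number:

```python
# Scaling-efficiency sketch using the article's GPU counts.
# perf_ratio is an ASSUMED stand-in for "more than tripled".
baseline_gpus = 3_584
scaled_gpus = 11_616
perf_ratio = 3.2  # assumption, not an official MLPerf result

gpu_ratio = scaled_gpus / baseline_gpus   # how many times more GPUs were used
efficiency = perf_ratio / gpu_ratio       # fraction of ideal linear scaling

print(f"GPU ratio: {gpu_ratio:.2f}x")
print(f"Scaling efficiency: {efficiency:.0%}")
```

Under this assumption the efficiency lands close to 100%, which is what "nearly perfect scaling" means: performance grows almost in lockstep with GPU count.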

Excellence in LLM Fine-Tuning

LLM fine-tuning, a critical workload for enterprises customizing pretrained large language models, was also a highlight. NVIDIA excelled in this area, scaling from eight to 1,024 GPUs and completing the benchmark in a record 1.5 minutes.

Accelerating Stable Diffusion and GNN Training

NVIDIA achieved up to an 80% increase in Stable Diffusion v2 training performance at the same system scales as the previous round. Additionally, the H200 GPU delivered a 47% boost in single-node graph neural network (GNN) training compared to the H100, demonstrating the powerful performance and efficiency of NVIDIA GPUs for various AI applications.

Broad Ecosystem Support

The breadth of the NVIDIA AI ecosystem was evident with 10 partners, including ASUS, Dell Technologies, and Lenovo, submitting their own impressive benchmark results. This widespread participation underscores the industry’s trust in NVIDIA’s AI platform.

MLCommons continues to play a vital role in AI computing by enabling peer-reviewed comparisons of AI and HPC platforms. This is crucial for guiding important purchasing decisions in a rapidly evolving field.

Looking ahead, the NVIDIA Blackwell platform promises next-level AI performance for trillion-parameter generative AI models, both in training and inference.


