Microsoft Unveils Beefy Custom AMD Chip To Crunch HPC Workloads On Azure
Ignite: One of the advantages of being a megacorp is that you can customize the silicon that underpins your infrastructure, as Microsoft is demonstrating at this week's Ignite conference in Chicago.
Redmond is bringing to its Azure cloud platform a custom hardware security module (HSM) and its own data processing unit (DPU), plus an intriguing custom AMD processor to power virtual machine instances targeting high-performance computing (HPC) workloads.
Described as Microsoft's latest advance in CPU-based supercomputing, the Azure HBv5 virtual machine is powered by custom AMD Epyc 9V64H processors. These are based on Zen 4 CPU cores rather than the latest Zen 5 technology, at up to 4 GHz peak frequency.
Unlike most VM instances, which typically share a processor with others, the Azure HBv5 will be spread across four Epyc 9V64H processors, for up to 352 cores and up to 9 GB of memory per core, supporting 6.9 TBps of memory bandwidth across 400-450 GB of HBM3 memory.
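For a sense of scale, the quoted figures can be broken down per socket and per core. The sketch below is our own back-of-the-envelope arithmetic, not Microsoft's: it assumes cores and bandwidth are split evenly across the four processors, uses decimal units (1 TB = 1,000 GB), and the variable names are purely illustrative.

```python
# Figures quoted for one Azure HBv5 instance.
SOCKETS = 4
TOTAL_CORES = 352
MEM_BANDWIDTH_TBPS = 6.9
HBM_GB_RANGE = (400, 450)   # quoted HBM3 capacity range
MAX_GB_PER_CORE = 9         # quoted "up to" figure

# Assumption: cores and bandwidth are divided evenly across the four chips.
cores_per_socket = TOTAL_CORES // SOCKETS
bandwidth_per_socket_tbps = MEM_BANDWIDTH_TBPS / SOCKETS
bandwidth_per_core_gbps = MEM_BANDWIDTH_TBPS * 1000 / TOTAL_CORES

# Memory per core if all 352 cores are exposed to the VM.
gb_per_core_all_cores = tuple(gb / TOTAL_CORES for gb in HBM_GB_RANGE)

# How many cores could each receive the full 9 GB from the quoted capacity.
cores_at_max_gb_per_core = tuple(gb // MAX_GB_PER_CORE for gb in HBM_GB_RANGE)

print(f"Cores per socket: {cores_per_socket}")
print(f"Bandwidth per socket: {bandwidth_per_socket_tbps:.3f} TBps")
print(f"Bandwidth per core: {bandwidth_per_core_gbps:.1f} GBps")
print(f"GB per core with all cores active: "
      f"{gb_per_core_all_cores[0]:.2f} to {gb_per_core_all_cores[1]:.2f}")
print(f"Cores that could each get {MAX_GB_PER_CORE} GB: "
      f"{cores_at_max_gb_per_core[0]} to {cores_at_max_gb_per_core[1]}")
```

On these numbers, 9 GB per core cannot hold with all 352 cores active, since that would need more than 3 TB of memory. The figure presumably applies to configurations that expose fewer cores, though that is an inference on our part rather than something Microsoft has stated here.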
Microsoft claims this memory bandwidth is up to 8x that of the latest bare-metal or virtual machine instances available on rival platforms. Hence the firm is pitching HBv5 at the most memory-constrained HPC applications, such as computational fluid dynamics, automotive and aerospace simulation, weather modeling, energy research, molecular dynamics, and computer-aided engineering.
Each instance also gets a 14 TB local NVMe SSD, said to be capable of up to 50 GBps read and 30 GBps write bandwidth, and 800 Gbps of Nvidia Quantum-2 InfiniBand networking.
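To put the storage and network numbers in the same units, here is a rough conversion. It is a sketch only: it assumes decimal units, sustained peak rates for the whole transfer, and no protocol overhead, none of which Microsoft has claimed, and the names are hypothetical.

```python
# Figures quoted for the local SSD and network of one HBv5 instance.
SSD_CAPACITY_TB = 14
SSD_READ_GBPS = 50        # gigabytes per second
SSD_WRITE_GBPS = 30       # gigabytes per second
INFINIBAND_GBITS = 800    # gigabits per second

# Assumption: decimal units, so 1 TB = 1,000 GB.
ssd_capacity_gb = SSD_CAPACITY_TB * 1000

# Time to stream the entire drive once at the quoted peak rates.
full_read_seconds = ssd_capacity_gb / SSD_READ_GBPS
full_write_seconds = ssd_capacity_gb / SSD_WRITE_GBPS

# Convert the network figure from bits to bytes (8 bits per byte).
infiniband_gbytes_per_s = INFINIBAND_GBITS / 8

print(f"Full-drive read: {full_read_seconds:.0f} s")
print(f"Full-drive write: {full_write_seconds:.0f} s")
print(f"InfiniBand: {infiniband_gbytes_per_s:.0f} GBps")
```

Under those idealized assumptions, the network link works out to roughly twice the quoted SSD read rate when both are expressed in bytes per second.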
One intriguing fact Redmond disclosed is that the cluster of custom chips making up each HBv5 instance will have twice the total Infinity Fabric bandwidth between them of "any AMD Epyc server platform to date."
This led some on The Reg systems desk to suspect that the Epyc 9V64H may actually be a version of AMD's MI300A APU chip, but with all CPU cores rather than a mix of GPU and CPU cores. We asked Microsoft for more details and will report back if we hear any more.
However, Azure HBv5 instances aren't even available as a technology preview yet. Anyone interested can sign up for access to the preview, which is set to start in the first half of 2025, Microsoft said.
Azure is also getting Microsoft's first in-house DPU, the imaginatively named Azure Boost DPU. As Reg readers will know, this is basically a programmable chip designed to offload network and/or storage processing from the host CPUs in a datacenter server.
This is based on tech that the cloud colossus gained from its acquisition of Fungible last year, and integrates high-speed Ethernet and PCIe interfaces along with network and storage engines, data accelerators, and security features, into a fully programmable system-on-chip.
"Built specifically for the Azure infrastructure, Azure Boost DPU is a hardware-software co-design that runs a custom, lightweight data-flow operating system to enable agile platforms with higher performance, lower power consumption, and enhanced efficiency compared to traditional implementations," said Corporate VP of Silicon Pradeep Sindhu, former co-founder and CEO at Fungible.
Another piece of custom silicon is Azure Integrated HSM. This type of chip is a dedicated hardware security component that performs encryption/decryption, and keeps the associated keys securely stored on the chip itself.
HSMs are not new, and cloud platforms, including Azure, already offer them. However, Microsoft says that Azure Integrated HSM eliminates the latency incurred by network round-trips to remote HSM services, or by requesting the release of keys from those remote HSMs.
"As a server-local HSM that securely binds to the workload environments, Azure Integrated HSM provides locally attached HSM services to both confidential and general-purpose virtual machines and containers. This provides the benefit of industry-leading in-use key protection without the latency drawbacks of round-trip network-attached HSM calls," explained chief technology officer Mark Russinovich, on a blog announcing the new silicon.
Starting next year, an Azure Integrated HSM will be part of every new server deployed on Azure, Microsoft said. ®