Why it matters: The Grace Hopper GH200 is Nvidia’s response to the growing AI and HPC processing demands of enterprises and data centers. When announcing Grace Hopper last year, Nvidia CEO Jensen Huang said that the tech industry was hitting a hard wall with traditional architectures, which is why it has been increasingly turning to GPUs and accelerated computing to solve complex computing tasks.
Nvidia’s DGX GH200 supercomputing platform packs 256 Grace Hopper GH200 superchips, each of which combines a 72-core Arm-based Grace CPU and an H100 Tensor Core GPU. Each superchip also supports up to 480GB of LPDDR5X memory and either 96GB of HBM3 or 144GB of HBM3e memory.
Those are impressive specs any way you slice them, and now Linux-focused publication Phoronix has gotten its hands on a $40K+ system based on the GH200 for some benchmarking (testing the CPU part of the SoC) to see whether the GH200 can live up to Nvidia’s claims.
The GH200 is Nvidia’s most powerful AI chip so far, designed for giant-scale AI and high-performance computing applications. The company claims it delivers up to ten times higher performance for applications processing terabytes of data, and that this boost in power enables scientists and researchers to create unprecedented solutions for complex problems.
From a corporate perspective, the Grace CPU serves as an alternative to x86 CPUs in the server segment, giving Nvidia a competitive product against AMD’s and Intel’s offerings. Nvidia has said that the Grace CPU offers up to twice the performance of Intel’s Sapphire Rapids and AMD’s Genoa CPUs at the same power and up to 3.5 times the efficiency of AMD’s last-gen Epyc Milan CPUs.
Phoronix conducted the CPU benchmarks in Linux, and the superchip proved competitive. In the HPCG benchmark, the GH200 came in just ahead of the dual-socket (2P) Xeon Platinum 8380 and just shy of the 2P Epyc 9654 Genoa, which Phoronix declared was “not bad at all for an initial showing.” It also found that the GH200 was the fastest against the 2P Genoa(X) and Intel Emerald Rapids single-processor configurations.
The GH200 tied with the single-socket (1P) AMD Epyc 9684X Genoa-X in the LavaMD test case of the Rodinia HPC benchmark. In AMG, the superchip system nearly matched the 2P Xeon Platinum 8380 in CPU performance. Phoronix called the processor’s performance “very impressive” with the NWChem computational chemistry software. The single GH200 nearly tied the leading 2P AMD Epyc Genoa configuration.
Overall, the GH200’s Grace CPU shows promising early potential, even though some workloads are not yet fully optimized for AArch64. It can closely match the performance of the Intel Xeon Platinum 8592+ Emerald Rapids across various benchmarks. The GH200 also delivered nearly twice the performance of the 128-core Ampere Altra Max Arm processor, showcasing how far Nvidia’s Arm CPUs have come since the early days of the Tegra SoC.
Intel and AMD configurations with higher core counts and dual-socket setups, such as those available with Intel Xeon Emerald Rapids and AMD Epyc Genoa(X) / Bergamo, can yield better results. Nvidia also offers a 144-core Grace Superchip version, which was not part of this test.