NVIDIA GH200 72 Core Grace CPU Performance vs. AMD Ryzen Threadripper Workstations

Written by Michael Larabel in Computers on 20 February 2024 at 10:55 AM EST. Page 2 of 5. 29 Comments.
NAS Parallel Benchmarks benchmark with settings of Test / Class: CG.C. HP Z6 G5 A - Threadripper PRO 7995WX was the fastest.
NAS Parallel Benchmarks benchmark with settings of Test / Class: FT.C. HP Z6 G5 A - Threadripper PRO 7995WX was the fastest.

Kicking off these CPU benchmarks was the NASA NPB HPC benchmarks for seeing how the CPU performance compares for the NVIDIA Grace CPU up against the AMD Ryzen Threadripper Zen 4 workstations. In some of these HPC benchmarks the 72-core GH200 CPU performance was quite similar to the 64-core Threadripper 7980X while the 96-core / 192-thread Threadripper PRO 7995WX led over those two in most of the NPB benchmarks.

NAS Parallel Benchmarks benchmark with settings of Test / Class: BT.C. HP Z6 G5 A - Threadripper PRO 7995WX was the fastest.
NAS Parallel Benchmarks benchmark with settings of Test / Class: IS.D. HP Z6 G5 A - Threadripper PRO 7995WX was the fastest.
NAS Parallel Benchmarks benchmark with settings of Test / Class: LU.C. HP Z6 G5 A - Threadripper PRO 7995WX was the fastest.

In some of these NPB benchmarks though they were less optimized for AArch64 than x86_64 that led to strong advantages for the Threadripper workstations.

Rodinia benchmark with settings of Test: OpenMP LavaMD. HP Z6 G5 A - Threadripper PRO 7995WX was the fastest.

To reiterate again, all of the benchmarks in this article across all of the systems is focusing on the CPU/system (non-GPU) performance.

Algebraic Multi-Grid Benchmark benchmark with settings of . GPTshop.ai - NVIDIA GH200 was the fastest.

For memory-intensive benchmarks like AMG, the GPTshop.ai GH200 workstation was able to outperform HP's flagship Z6 G5 A workstation. The Threadripper PRO 7995WX with eight DDR5 memory channels was already a big uplift over the 64-core Threadripper 7980X with four DDR5 memory channels, but that Threadripper PRO couldn't match the GH200 with the high performance memory.

Algebraic Multi-Grid Benchmark benchmark with settings of . GPTshop.ai - NVIDIA GH200 was the fastest.

On a performance-per-dollar basis for the systems as configured though, the Threadripper workstations were delivering better value.

NWChem benchmark with settings of Input: C240 Buckyball. GPTshop.ai - NVIDIA GH200 was the fastest.

The NWChem computational chemistry software was another workload benefiting well from the memory bandwidth with the Grace CPU.

Xcompact3d Incompact3d benchmark with settings of Input: X3D-benchmarking input.i3d. GPTshop.ai - NVIDIA GH200 was the fastest.

Or Incompact3D also loves memory bandwidth and allowed a strong first place finish with the GH200 workstation.

Xcompact3d Incompact3d benchmark with settings of Input: X3D-benchmarking input.i3d. GPTshop.ai - NVIDIA GH200 was the fastest.

In the case of this finite difference HPC code, the performance advantage of GH200 was strong enough that the GPTshop.ai workstation was delivering the best performance per dollar of these workstations as tested.


Related Articles