LLVM Clang Shows Off Great Performance Advantage On NVIDIA GH200's Neoverse-V2 Cores

Written by Michael Larabel in Software on 18 March 2024 at 11:20 AM EDT.

During my recent NVIDIA GH200 Grace CPU benchmarking, carried out remotely via GPTshop.ai, I looked at areas like the performance benefits of a 64K kernel page size and also ran fresh benchmarks comparing binaries generated by LLVM Clang against those from GCC, the default compiler on Ubuntu Linux. This article shows the performance difference on the 72-core Neoverse-V2 server/HPC processor when using LLVM Clang rather than the GNU Compiler Collection.

This round of testing consists of straightforward compiler benchmarks carried out last month on the GPTshop.ai GH200 server. Given the compiler focus, a variety of CPU workloads were built with GCC 13.2, the default compiler on Ubuntu 23.10 AArch64, and compared against builds from LLVM Clang 17.0.2 as available in the Ubuntu 23.10 archive. The same compiler flags were used for both compilers throughout testing on this high-performance ARM64 server.
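
For readers who want to try a similar comparison on their own hardware, here is a minimal sketch of building one source file with both compilers under identical flags. It is not the harness used for these benchmarks: the flag set, the `gcc`/`clang` driver names, and the helper names `build_commands` and `build_all` are all assumptions for illustration, since the article does not list the exact flags used.

```python
import shutil
import subprocess

# Hypothetical flag set; the article does not state which flags were used.
COMMON_FLAGS = ("-O3", "-mcpu=native")

# Driver names as commonly installed on Ubuntu (an assumption).
COMPILERS = {"gcc": "gcc", "clang": "clang"}


def build_commands(source, flags=COMMON_FLAGS):
    """Return one compile command per compiler, all sharing the same flags."""
    stem = source.rsplit(".", 1)[0]
    commands = {}
    for label, driver in COMPILERS.items():
        # Each binary gets a compiler-specific suffix so both can coexist.
        commands[label] = [driver, *flags, source, "-o", f"{stem}-{label}"]
    return commands


def build_all(source, flags=COMMON_FLAGS):
    """Run each command whose compiler is on PATH; return labels that built."""
    built = []
    for label, cmd in build_commands(source, flags).items():
        if shutil.which(cmd[0]) is None:
            continue  # compiler not installed, skip rather than fail
        subprocess.run(cmd, check=True)
        built.append(label)
    return built
```

Calling `build_all("bench.c")` would produce `bench-gcc` and `bench-clang` where both compilers are present. Keeping the flags in one shared tuple is the point of the sketch: any difference between the two binaries then comes from the compiler rather than from the options passed to it.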

These benchmarks are mainly for reference and curiosity, to see how AArch64 performance looks for Clang-generated binaries compared to GCC, which is typically the default compiler on most Linux distributions. For those interested, I have run many x86_64 Clang benchmarks given my abundance of Intel and AMD processors, but I have far less AArch64 hardware around, so the GH200 made for an interesting opportunity to revisit the compiler performance comparison. Thanks to GPTshop.ai for making this NVIDIA ARM64 server remotely available for benchmarking.