AOCC 4.0 Shows The Strong Advantages Of Compiler Optimizations With 4th Gen AMD EPYC CPUs
Last month, when AMD launched the EPYC 9004 "Genoa" series, they also published AOCC 4.0, the newest version of the AMD Optimizing C/C++ Compiler, which is derived from LLVM/Clang and tailored to their latest Zen microarchitecture. At the time I ran some AOCC 4.0 benchmarks on the Ryzen 7000 series and compared it to GCC and Clang. Since then I've had time on my Genoa test rig to look at how well AOCC 4.0 performs, and this article has some benchmarks comparing GCC and AOCC 4.0 on the EPYC 9374F processors.
Today's benchmarking festivities provide an initial gauge of AOCC 4.0 performance on AMD 4th Gen EPYC against GCC 12.2, which is the current stable GNU Compiler Collection release and what is currently shipped by Ubuntu 22.10, Fedora 37, and other current Linux distributions. This round of compiler testing used the AMD EPYC 9374F in a 2P configuration; the 9374F is the 32-core per socket, frequency-optimized Zen 4 SKU.
Ubuntu 22.10 was running on this AMD Titanite reference server with the two EPYC 9374F processors, providing the latest GCC 12 stable compiler out of the box and using the stock Linux 5.19 kernel. Ubuntu 22.04 LTS, meanwhile, is on the even older GCC 11 compiler by default. Thanks to AMD for providing the review hardware under test. Along with GCC 12.2 stable, AOCC 4.0 as obtained from AMD.com was tested to see how well AMD's Zen 4 tuned compiler performs on this same EPYC high performance server across a range of C and C++ open-source workloads.
AOCC 4.0 can be downloaded for free at developer.amd.com.
When testing both GCC and AOCC, the same CFLAGS/CXXFLAGS were maintained and no other changes were made besides swapping out the underlying compiler used for building all of the open-source applications and software under test.
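For anyone wanting to reproduce that kind of controlled comparison, here is a minimal Python sketch of the idea: two build environments that share identical CFLAGS/CXXFLAGS and differ only in CC/CXX. The flag values, the AOCC install path, and the helper names `build_env` and `differing_keys` are all assumptions for illustration; the exact flags used for these benchmarks are not stated here.

```python
import os

# Hypothetical flags for illustration; substitute whatever flags you
# actually want to hold constant across both compilers.
SHARED_FLAGS = "-O3 -march=native"


def build_env(cc, cxx, flags=SHARED_FLAGS, base=None):
    """Return a copy of the environment with the compiler and flags set.

    Only CC/CXX vary between calls; CFLAGS/CXXFLAGS stay the same.
    """
    env = dict(os.environ if base is None else base)
    env.update({"CC": cc, "CXX": cxx, "CFLAGS": flags, "CXXFLAGS": flags})
    return env


def differing_keys(a, b):
    """List the variables whose values differ between two environments."""
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))


gcc_env = build_env("gcc", "g++")
# Hypothetical install location; AOCC is a Clang-based toolchain, so its
# drivers are assumed here to be named clang and clang++.
aocc_env = build_env("/opt/aocc/bin/clang", "/opt/aocc/bin/clang++")

# Sanity check that the compiler is the only thing being swapped.
print(differing_keys(gcc_env, aocc_env))
```

Either dictionary can then be passed as the `env` argument to `subprocess.run` when invoking a project's build system, so the only variable between the two builds is the compiler itself.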
So far the upstream GCC Zen 4 (Znver4) work is limited but there are some tuning patches on the way.
A wider range of compiler tests on EPYC 4th Gen will come as the upstream open-source compilers begin seeing substantive Zen 4 optimizations. In October the initial Znver4 target was added to GCC 13, enabling the new CPU instructions but carrying over the Zen 3 cost table and lacking any extra tuning. Earlier this month SUSE began working on Znver4 tuning for GCC, but as of writing those patches have yet to be merged. Once that initial round of tuning wraps up for GCC 13, it will likely be an interesting point of comparison against AOCC.
To contrast the strategies of the two competitors, here is a look at Intel's Sapphire Rapids enablement work in GCC: that code began landing in 2020 and has seen improvements over time, all while still ahead of release.
On the upstream LLVM/Clang side, at the end of November AMD posted an initial Znver4 patch. As of writing that patch is still undergoing review, but it looks like it will be merged soon. However, like the initial GCC patch, that LLVM enablement patch isn't yet fully tuned for Znver4 and carries over from the Znver3 compiler code. No other Znver4 patches are currently in LLVM's review queue.
So once the upstream GCC and LLVM/Clang work for Zen 4 is in better standing, it should make for a much more interesting and competitive compiler comparison on Zen 4. As I've preached for years, it's too bad that the upstream compiler support isn't squared away sooner, ideally pre-launch as is generally the case with Intel, which at times has its compiler enablement work sorted a year or two ahead of launch so there is plenty of time for it to appear in released and shipping compiler versions.
Thus today is a simple look at the performance uplift possible on AMD EPYC Genoa with AOCC 4.0 over the current GCC stable compiler, using this EPYC 9374F 2P server running Ubuntu Linux.