AMD AOCC 4.0 Arrives For Squeezing More Performance Out Of Zen 4
On Thursday, alongside the launch of the AMD 4th Gen EPYC Genoa processors, AMD also published AOCC 4.0 as the newest version of the AMD Optimizing C/C++ Compiler. I've been putting it through its paces over the past day, and it continues to show the positive performance impact of proper compiler tuning.
The AMD Optimizing C/C++ Compiler 4.0 release most notably introduces initial Zen 4 "znver4" support and optimizations. The public change-log simply sums it up as, "AMD Family 19h processors (AMD “Zen4” core architecture) support and optimizations." From that it's not clear how extensive the Zen 4 optimizations are at this stage, particularly with AOCC being a closed-source downstream of LLVM/Clang.
Meanwhile, upstream LLVM/Clang has yet to add the Zen 4 (znver4) target or any other Zen 4 specific optimizations. As noted in prior Phoronix articles, the Zen 4 tuning for GCC 13 only landed at the end of October and for now uses the same cost/tuning table as Zen 3. The GCC 13 stable release won't be out until March or April, so there is still time for some improvements to land. With time AMD is expected to provide more optimized compiler support for GCC and LLVM/Clang; my thoughts on the matter are outlined in more detail in yesterday's EPYC 9554/9654 Linux review conclusion. As it stands right now, AOCC 4.0 is the way to go if wanting the best optimized compiler for targeting Zen 4 processors.
In addition to AOCC 4.0 having Zen 4 support and optimizations, there are OpenMP 4.5 support improvements for Fortran, debugging/diagnostics improvements, tuning for the AMD Math Library 4.0, support for vector and faster lib variants of the AMD Math Library, and improved variants of various scalar/vector/loop transformations. More details on AOCC 4.0 via developer.amd.com.
No pending Zen 4 (znver4) patches yet for upstream LLVM.
AOCC 4.0 is derived from the upstream LLVM/Clang 14.0.6 sources. LLVM 15 was released in early September, but AOCC hasn't yet been re-based to that latest half-year feature release. For some preliminary benchmarking on Zen 4, I thus compared AOCC 4.0 against LLVM Clang 14.0 as packaged on Ubuntu Linux.
Because the Titanite server is my lone EPYC 9004 series testing platform for now and is busy carrying out other benchmarks, I have been running the AOCC 4.0 benchmarks on the AMD Ryzen 9 7950X. AOCC 4.0 supports all the Ryzen / Threadripper / EPYC products and still retains support for earlier generations of Zen processors.
During testing, all of the CFLAGS/CXXFLAGS were kept the same; the only change was rebuilding all of the software under test with either AMD AOCC 4.0 or upstream LLVM Clang 14.0.
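For readers wanting to try a similar same-flags comparison themselves, here is a minimal sketch of the idea: generate one build command per compiler from a single shared flag string, so the compiler binary is the only variable. The compiler paths, the `-O3 -march=native` flags, the `bench.c` file name, and the `build_commands` helper are all hypothetical placeholders for illustration; they are not the flags or tooling used for the results in this article.

```python
import shlex


def build_commands(compilers, cflags, source, out_prefix="bench"):
    """Return one argv list per compiler, all sharing the same flags.

    compilers: dict mapping a short label to a compiler executable.
    cflags:    a single flag string applied unchanged to every compiler.
    """
    flags = shlex.split(cflags)
    commands = {}
    for label, cc in compilers.items():
        # Only the compiler binary and the output name differ per build.
        commands[label] = [cc, *flags, source, "-o", f"{out_prefix}-{label}"]
    return commands


if __name__ == "__main__":
    # Hypothetical install locations and flags; adjust for your own system.
    compilers = {
        "aocc": "/path/to/aocc-4.0/bin/clang",
        "clang14": "clang-14",
    }
    for label, argv in build_commands(compilers, "-O3 -march=native", "bench.c").items():
        print(label, "->", shlex.join(argv))
```

Each generated command could then be handed to `subprocess.run` to do the actual builds, keeping the flag string in one place so the two binaries cannot drift apart in how they were compiled.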
As upstream LLVM/Clang Git and GCC Git pick up more optimizations for Zen 4, I'll of course run a much larger and broader compiler comparison on Zen 4 -- plus running on the more relevant EPYC 9004 series processors too.