GCC 12 Compiler Lands A Last Minute AMD Zen 3 Tuning Tweak
Jan Hubicka of SUSE, who has been responsible for much of AMD's Zen tuning work for the GNU Compiler Collection, landed a minor Znver3 tweak this week.
This week's Znver3 change disables gather instructions for vectors with 2 or 4 elements. This apparently came out of profiling on Zen 3 to weigh the benefits of the behavior, though no benchmark figures were shared as part of the code commit. It's unlikely to have any significant real-world performance impact. Gather instruction usage has yielded mixed results across CPU microarchitectures: last year, as an "optimization" patch, Hubicka originally enabled gather instruction use for Zen 3 because it helped some benchmarks, but it has since turned out not to always be a win, hence the disabling in some cases.
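For readers unfamiliar with gathers, here is a minimal sketch of the kind of loop where a vectorizing compiler may consider one: an indexed (indirect) load, where each element is fetched through an index array. The function name `gather_copy` is hypothetical and is not taken from the GCC commit. Whether GCC actually emits a gather instruction for this loop depends on the compiler version, the vector width chosen, and the tuning in effect, so treat it as an illustration rather than a reproduction of the profiled cases.

```c
#include <stddef.h>

/* Hypothetical example, not from the GCC commit.
 * Indexed load: dst[i] = src[idx[i]].
 * The source addresses are not contiguous, so a vectorized version of
 * this loop needs either a hardware gather instruction or a sequence
 * of scalar loads that are packed into a vector afterwards. Which one
 * the compiler picks is a tuning decision made per target. */
void gather_copy(double *restrict dst, const double *restrict src,
                 const int *restrict idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[idx[i]];
}
```

Building this with something like `gcc -O3 -march=znver3 -S` and inspecting the assembly, or adding `-fopt-info-vec` to see the vectorizer's report, is one way to check what a given GCC release does with it.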
While GCC 12 is this year's annual feature release of this prominent open-source compiler, there isn't much in the way of additional Zen 3 work in it. AMD, by way of SUSE, squeezed the Znver3 tuning into GCC 11 just weeks ahead of that compiler's release last year. That tuning work was held up until after the EPYC 7003 series was introduced, months after the original Ryzen 5000 series introduction. Since that initial work on Znver3, there hasn't been much for GCC 12: in fact, just one new patch besides this week's gather disabling.
All of the "znver3" Zen 3 commits to the GCC codebase. No Znver4 commits yet.
As mentioned already, there isn't any AMD Zen 4 (znver4) target/tuning for GCC 12. While long ago AMD pushed out compiler tuning patches ahead of CPU releases, that hasn't been the case recently, and the company seems content to focus its optimized AMD CPU compiler support primarily on its in-house LLVM/Clang downstream: the AMD Optimizing C/C++ Compiler (AOCC). Zen 4 CPUs will begin shipping later this year, but sadly users will need to wait for GCC 13 next year (or for a back-port to a later GCC 12 point release, which distributions aren't quick to pick up) for out-of-the-box Zen 4 tuning support.
In comparison, Intel landed its initial Alder Lake and Sapphire Rapids support in GCC back in mid-2020 and has continued working on the code in the open. As a result, GCC 11 and GCC 12 already have the "sapphirerapids" and "alderlake" targets, with tuning ready to go in released compilers ahead of launch, and those compilers are found by default in modern Linux distributions.
Intel's timely compiler enablement for both GCC and LLVM/Clang ahead of launch is one of many things to love about its open-source/Linux software support, and it has been an ongoing tradition for many years.