GCC Compiler Sees New Patch With Tuning For AMD Zen 3 (Znver3)
Back in December the GCC 11 compiler picked up initial support for AMD Zen 3 (znver3). That support only enabled the new CPU instructions found with this latest-generation Zen microarchitecture. Unfortunately missing at that time was any tuning that would let the compiler make more informed decisions about instruction scheduling, instruction costs, etc. So the znver3 target has been carrying the same tuning data as Zen 2, which itself is largely carried over from the original Zen compiler code.
Now today, a few weeks prior to the GCC 11 stable release, a new Zen 3 compiler patch has surfaced. Longtime GCC developer Jan Hubicka of SUSE has provided a "tuning part 1" patch for the GCC Znver3 code.
Hubicka tuned the Znver3 target based on benchmarks, but overall he characterized the tuning as a smooth upgrade from Zen 2. The tuning adjusts for some instructions that now have shorter latencies on Zen 3 CPUs, for gather instructions being a lot faster, and for FMADD being optimized with Zen 3.
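For a sense of the kind of code this tuning touches, here is a minimal sketch of a loop that combines an indexed load (a candidate for a gather instruction) with a multiply-add (a candidate for FMA). The `gather_fma` and `zen_target` functions are hypothetical names made up for illustration and are not from the patch; whether GCC actually vectorizes the loop or uses gather depends on the compiler version, the flags, and its cost model.

```c
#include <stddef.h>

/* Hypothetical kernel for illustration: y[i] += a * x[idx[i]].
 * The indexed load x[idx[i]] can be turned into a gather instruction
 * and the multiply-add into an FMA if the compiler decides to
 * vectorize; that decision depends on the target's cost tables. */
void gather_fma(double *restrict y, const double *restrict x,
                const int *restrict idx, double a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[idx[i]];
}

/* Reports which Zen target the file was built for, using the macros
 * GCC predefines for -march=znver3 and -march=znver2. */
const char *zen_target(void)
{
#if defined(__znver3__)
    return "znver3";
#elif defined(__znver2__)
    return "znver2";
#else
    return "other";
#endif
}
```

Building the file with something like `gcc -O3 -march=znver3 -S` and again with `-march=znver2` and comparing the generated assembly is one way to see whether the tuning changes the compiler's choices for a given loop.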
That's all dandy and great to see, but it's noted that some performance regressions remain and the instruction scheduler could still use some love, as it still treats Zen as an in-order CPU. Other areas are also left for further tuning of the Znver3-optimized code path.
This initial tuning patch can be found on the GCC mailing list.
It's great seeing this patch surface, and hopefully it will be in good shape to land for the upcoming GCC 11.1 stable release even though we are well into stage four (regression fixing).
Timely compiler enablement is still an area where AMD could benefit from improvements. The initial/basic Znver3 support didn't land in trunk until December, after Ryzen 5000 series processors were already shipping. While EPYC 7003 is now imminent, GCC 11.1 won't be out as stable until April, and in terms of Linux distribution adoption it won't appear in the likes of Ubuntu until its 21.10 release in October. And as noted, the tuning for Zen 3 is still ongoing, so it will be carried forward to future GCC 11 point releases / GCC 12. If AMD had gotten their Znver3 support out months ago, the eager enthusiasts on the likes of Arch Linux and Gentoo could have already been pounding on this code with Ryzen processors for a while now, helping to flush out any issues or areas for improvement prior to new EPYC processors hitting their all-important HPC and data center customers.
Intel meanwhile sent out their Sapphire Rapids and Alder Lake enablement last summer and continues to work on feature bring-up for areas like AMX. Intel's "icelake-server" target has also been mainlined for more than two years already and is in released versions. For years now Intel has been reliably getting their GCC and LLVM/Clang support upstream around a year or two ahead of launch, even when CPUs arrive on schedule. One of Intel's strengths has been their very punctual and predictable Linux/open-source support, landing well ahead of launch so that support is generally widespread by the time the hardware ships. Hopefully this will be improved upon moving forward on the AMD side given their growing Linux customer base in areas like HPC.
Hopefully LLVM Clang picks up similar Znver3 tuning soon, but there the timing is even less opportune as LLVM 12.0 is releasing literally any day now. LLVM/Clang 12 has the basic Znver3 enablement, while any further tuning will come in an LLVM 12.0.1 point release in a few months and/or LLVM 13 in the September timeframe.
I'll be running some fresh AMD Zen 3 compiler benchmarks shortly with this latest GCC patch.