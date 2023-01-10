
More AMD Zen 4 Tuning Ongoing For GCC 13 Compiler
In recent weeks Hubicka has been fiddling around with the new "znver4" target in GCC 13. Today he sent out more znver4 x86-tune flags, looking to make some more micro-optimizations to the -march=znver4 targeting for AMD Ryzen 7000 series and EPYC 9004 series processors.
Hubicka sums up the latest work as:
this patch adds more tunes for zen4:
- new tunes for avx512 scatter instructions. In micro benchmarks these seem to be a consistent loss compared to open-coded code.
- disable use of gather for zen4. While these are a win for micro benchmarks (based on TSVC), enabling gather is a loss for parest. So for now it seems safe to keep it off.
- disable pass to avoid FMA chains for znver4 since fmadd was optimized and does not seem to cause regressions.
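For those wondering what kind of code these tunables affect, here is a minimal C sketch of the three loop shapes involved: an indexed load (gather), an indexed store (scatter), and a dependent multiply-add chain. The function names are made up for illustration and are not taken from the patch, TSVC, or parest. Whether GCC actually emits AVX-512 gather/scatter or FMA instructions for them depends on the compiler version, the optimization flags, and the vectorizer's cost model, so treat this as an illustration rather than a reproduction of Hubicka's benchmarks.

```c
#include <stddef.h>

/* Gather-shaped loop: every iteration loads from a data-dependent index.
   A vectorizer can use a hardware gather here or open-code it with
   scalar loads. (Hypothetical example function.) */
void gather_copy(float *restrict dst, const float *restrict table,
                 const int *restrict idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = table[idx[i]];
}

/* Scatter-shaped loop: every iteration stores to a data-dependent index.
   Assumes the indices in idx do not repeat. (Hypothetical example function.) */
void scatter_copy(float *restrict dst, const float *restrict src,
                  const int *restrict idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[idx[i]] = src[i];
}

/* Dependent multiply-add chain: each step needs the previous accumulator
   value, which is the pattern that fused multiply-add instructions target.
   (Hypothetical example function.) */
float dot(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}
```

Compiling a file like this with -O3 -march=znver4 and having GCC write out the assembly (-S) is one way to see which instructions a given compiler build chooses for each loop.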
Once this compiler tuning work settles down for GCC, I'll run some fresh benchmarks -- especially for seeing how it compares to AMD's AOCC 4.0 compiler. The stable GCC 13.1 release should be out in March~April depending upon how the rest of the release cycle plays out. Fedora 38 is planning to be among the first Linux distributions shipping this new compiler while it won't reach the likes of Ubuntu users out-of-the-box until October with the distribution's 23.10 release.
On the LLVM/Clang front, checking the upstream review queue this morning, there still isn't anything new beyond the basic znver4 enablement sent out by AMD and merged a few weeks ago.