Tuned AMD Zen 4 Scheduler Model Lands In LLVM 17 Compiler
Back in December, initial AMD Zen 4 "znver4" support was merged for the LLVM/Clang 16 compiler. While "-march=znver4" targeting at least enables the AVX-512 instructions newly supported by these AMD processors, it has been re-using the existing scheduler model from Zen 3. Today a tuned Zen 4 scheduler model has finally landed and will be found in the LLVM 17 compiler later this year.
The AMD Zen 4 "znver4" scheduler model was merged a few minutes ago to LLVM Git, replacing the (not very accurate) re-use of the Zen 3 model for the new Ryzen 7000 series and EPYC 9004 series processors. The scheduler model is primarily tuned for 4th Gen EPYC 9004 "Genoa" processors.
AMD compiler engineer Ganesh Gopalasubramanian commented in the merge request, "The patch has the details of the znver4 scheduler model. There are ample improvements with respect to instructions, execution units, latencies and throughput when compared with znver3."
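For a rough sense of why per-instruction latencies in a scheduler model matter, here is a toy Python sketch. It is not how LLVM represents its models (those live in TableGen files), and every latency, opcode name, and instruction below is a made-up placeholder rather than a real Zen 3 or Zen 4 figure. It only shows that the same dependency chain gets a different estimated completion time when the latency table changes, which is the kind of estimate a compiler leans on when ordering instructions.

```python
from typing import Dict, List, Tuple

# Toy illustration only: all names and numbers here are hypothetical,
# not taken from the LLVM znver3/znver4 scheduler models.

# Each instruction: (name, opcode, names of earlier instructions it depends on)
Instr = Tuple[str, str, List[str]]


def critical_path(instrs: List[Instr], latency: Dict[str, int]) -> int:
    """Return the cycle at which the last result is ready.

    Assumes unlimited execution units and that every dependency
    appears earlier in the list than the instruction using it.
    """
    ready: Dict[str, int] = {}
    for name, opcode, deps in instrs:
        # An instruction can start once all of its inputs are ready.
        start = max((ready[d] for d in deps), default=0)
        ready[name] = start + latency[opcode]
    return max(ready.values(), default=0)


# A small made-up dependency graph.
chain: List[Instr] = [
    ("a", "load", []),
    ("b", "mul", ["a"]),
    ("c", "add", ["b"]),
    ("d", "mul", ["a"]),
    ("e", "add", ["c", "d"]),
]

# Two hypothetical latency tables standing in for an older and a newer model.
model_old = {"load": 4, "mul": 4, "add": 1}
model_new = {"load": 4, "mul": 3, "add": 1}

if __name__ == "__main__":
    print("old model estimate:", critical_path(chain, model_old))
    print("new model estimate:", critical_path(chain, model_new))
```

A real scheduler model also describes execution units and throughput, as the quote above notes, which this sketch deliberately leaves out.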
The new commit adds 34,544 lines to the LLVM code-base -- counting the new test cases and the model itself. Sadly it missed out on LLVM 16.0, which is being released in the coming days, but it is now merged for LLVM 17 and could potentially be back-ported for an LLVM 16.0 point release if all goes well.
GCC 13 is also releasing in the coming weeks with its Zen 4 support worked on by AMD and SUSE. For those wanting a production-ready compiler right now, AMD's AOCC 4.0, a downstream of LLVM/Clang, has great Zen 4 support. It's too bad though that it's taken until several months after launch -- and LLVM 17.0 won't be released until ~September unless this gets back-ported to a 16.0.x point release -- before this tuned znver4 support is ready for optimizing binaries on Ryzen 7000 series and EPYC 9004 series systems.
I'll be working on some fresh Zen 4 compiler benchmarks shortly given this latest LLVM activity and the ongoing improvements that have been squeezed in for GCC 13.