AMD Zen 2 "Znver2" Optimizations With LLVM Clang 10 Bring Some Improvements
LLVM Clang 10 adds a Zen 2 scheduler model tuned for the latest AMD CPUs, improving on the existing "znver2" tuning that had just copied the Zen 1 scheduler model. Here are some benchmarks looking at LLVM Clang 9 vs. Clang 10 compiler performance on AMD EPYC when making use of the "-march=znver2" optimizations.
On the AMD EPYC 7742 2P server running Ubuntu 19.10 with the Linux 5.5 kernel, I carried out benchmarks earlier this month comparing the LLVM Clang 9.0.1 performance to that of LLVM Clang 10.0 after the Zen 2 (znver2) improvements landed and around the time of the LLVM 10.0 branching.
Both Clang 9 and Clang 10 Git were built the same way, in release mode. As usual when looking at compiler optimizations/tuning across dozens of workloads, the results are mixed:
GraphicsMagick and PostgreSQL saw some big wins with LLVM Clang 10, and there were smaller improvements in some of the video encode and compression tests. To no surprise, code built faster with Clang 9 than with Clang 10: newer compilers add more optimization passes and tuning, so they take longer to compile code in an effort to produce faster binaries. Setting aside the timed compilation results and the multiple libgav1 tests, Clang 10 overall is looking good with just a few losses.
Out of 73 C/C++ benchmarks tested between these Clang compiler builds, Clang 10 led nearly 60% of the time.
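For readers who want to reproduce that kind of tally from their own runs, here is a minimal Python sketch of counting how often one compiler leads. The benchmark names and numbers are hypothetical placeholders, not values from these tests, and the `higher_is_better` flag is an assumption about how each result is reported (for example, frames per second versus seconds).

```python
def leads(old, new, higher_is_better):
    """Return True when the newer compiler's result beats the older one."""
    return new > old if higher_is_better else new < old

def win_share(results):
    """Fraction of benchmarks where the newer compiler leads."""
    if not results:
        raise ValueError("no results to compare")
    wins = sum(1 for _, old, new, hib in results if leads(old, new, hib))
    return wins / len(results)

# Hypothetical placeholder data: (name, clang9_result, clang10_result, higher_is_better)
sample_results = [
    ("bench_a", 100.0, 112.0, True),   # throughput went up: win
    ("bench_b", 30.0, 28.5, False),    # run time went down: win
    ("bench_c", 55.0, 54.0, True),     # throughput went down: loss
    ("bench_d", 12.0, 12.6, False),    # run time went up: loss
    ("bench_e", 8.0, 8.4, True),       # throughput went up: win
]

share = win_share(sample_results)
print(f"Newer compiler leads in {share:.0%} of the sample benchmarks")
```

A win count like this ignores the size of each win or loss, which is why it can differ from what an averaged result suggests.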
Taking the geometric mean of all those benchmark results, Clang 10.0 Git and Clang 9 were neck-and-neck with this AMD Zen 2 targeting.
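As an illustration of how a geometric mean can come out even when individual tests swing both ways, here is a small Python sketch. The data is a hypothetical placeholder rather than anything from these results, and normalizing each test to a Clang 10 vs. Clang 9 ratio (inverted for lower-is-better tests) is an assumed approach, not necessarily how the figures here were computed.

```python
import math

def relative_performance(old, new, higher_is_better):
    """Ratio above 1.0 means the newer compiler did better on this test."""
    return new / old if higher_is_better else old / new

def geometric_mean(values):
    """Geometric mean via logarithms; all values must be positive."""
    if not values:
        raise ValueError("geometric mean of an empty list is undefined")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical placeholder data: (name, clang9_result, clang10_result, higher_is_better)
sample_results = [
    ("bench_a", 100.0, 110.0, True),   # 10% more throughput
    ("bench_b", 20.0, 22.0, False),    # 10% longer run time
    ("bench_c", 50.0, 50.0, True),     # unchanged
]

ratios = [relative_performance(old, new, hib) for _, old, new, hib in sample_results]
overall = geometric_mean(ratios)
print(f"Geometric mean of relative performance: {overall:.3f}")
```

In this sample, one gain and one regression of matching size cancel out, so the geometric mean lands at parity even though two of the three tests moved.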
See all of these benchmarks in full via this OpenBenchmarking.org result file.