GCC's Profile Guided Optimization Performance With The Ryzen 9 5950X
Given the talk in recent days around patches for PGO'ing the Linux kernel, and with some readers not being familiar with Profile Guided Optimization (PGO) in compilers, here are some fresh benchmarks on a Ryzen 9 5950X looking at the benefits of applying PGO to various benchmarks.
We have benchmarked GCC and Clang PGO performance many times over the years, with this being some fresh data using a Ryzen 9 5950X and the latest software stack on Ubuntu 20.10. The testing was done by first running various open-source benchmarks without PGO, then repeating the tests to generate profiles for the compiler to consume with PGO, and finally benchmarking those PGO-enabled builds. These numbers are close to best-case scenarios: the PGO-enabled builds are benchmarked on the same tests that produced the profiles, so the workload matches the original profile well. In more real-world, general-purpose scenarios it can be more difficult to generate an accurate profile for your actual workflow / software usage.
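For readers new to PGO, the build, profile, rebuild cycle described above can be sketched roughly as follows. This is a minimal illustration and not the exact procedure used for these benchmarks: the source file `bench.c`, the binary name `bench`, the `-O2` baseline, and the `--training-input` argument are all hypothetical placeholders. The only real compiler options used are GCC's `-fprofile-generate` and `-fprofile-use`. The helper only assembles the command lines and does not run them.

```python
import shlex


def pgo_build_steps(source="bench.c", binary="bench", opt_level="-O2",
                    workload_args=()):
    """Return the three command lines of a typical GCC PGO cycle.

    All file names and the optimization level are illustrative defaults,
    not values taken from the article.
    """
    # Step 1: instrumented build that records profile data when it runs.
    instrumented = ["gcc", opt_level, "-fprofile-generate", source, "-o", binary]
    # Step 2: training run on a representative workload (writes .gcda files).
    training_run = [f"./{binary}", *workload_args]
    # Step 3: rebuild the same source, letting GCC consume the recorded profile.
    optimized = ["gcc", opt_level, "-fprofile-use", source, "-o", binary]
    return [instrumented, training_run, optimized]


if __name__ == "__main__":
    # "--training-input" is a made-up argument standing in for a real workload.
    for step in pgo_build_steps(workload_args=("--training-input",)):
        print(shlex.join(step))
```

The quality of the result depends on step 2: the closer the training run is to how the binary is actually used, the more useful the profile is to the compiler.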
Via the Phoronix Test Suite, various programs were tested with and without PGO to look at the impact. With new LLVM and GCC releases around the corner, a larger comparison will be coming up in the weeks ahead to complement this weekend benchmarking fun.
The TSCP chess benchmark saw the largest gain in performance from the Profile Guided Optimizations.
Enabling PGO offered up to a few percent faster performance on this Ryzen 9 5950X + Ubuntu 20.10 system on top of other compiler optimizations. But unlike raising the optimization level or enabling LTO, it's not as easy as just throwing a compiler switch: it requires accurate profiles for the greatest impact. So that's the quick look for now at current Profile Guided Optimization performance, with more on the way for GCC 11 and Clang 12.