Profile Guided Optimizations (PGO) Likely Coming To Linux 5.14 For Clang
For those building the Linux kernel with Clang, support for link-time optimizations (LTO) was recently introduced as another performance win. That in turn allowed Clang Control Flow Integrity (CFI) support to land in the mainline kernel. In the past there were patches to the Linux kernel to support GCC's LTO and PGO functionality, but they were never mainlined.
Clang PGO support for the Linux kernel currently resides in the for-next/clang/features branch from Google's Kees Cook. This support was worked on by Google engineers, who have for years been using Clang to build the Linux kernel and their other components for Android and Chrome OS. Given the for-next marking, it looks like this PGO support will be submitted for the upcoming Linux 5.14 merge window.
The Clang PGO infrastructure allows raw profile data to be collected via /sys/kernel/debug/pgo/profraw. That raw profile data then needs to be processed using the llvm-profdata tool, at which stage multiple profiles can also be merged. When rebuilding for the PGO-optimized kernel, the processed profile data is passed back in via the compiler flags, such as make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata.
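As a rough illustration of that collect, process, and rebuild sequence, here is a minimal Python sketch that only assembles the command lines and prints them rather than running anything. The debugfs path and the make invocation are the ones quoted above; the helper names and the boot.profraw / bench.profraw file names are hypothetical, and the sketch assumes llvm-profdata's standard merge subcommand with its -o output option. Treat it as an outline under those assumptions, not as the documented procedure from the patch series.

```python
import shlex

# Path quoted in the article; it should only exist on a kernel built with
# the Clang PGO instrumentation and with debugfs mounted.
PROFRAW_DEBUGFS = "/sys/kernel/debug/pgo/profraw"


def collect_cmd(dest):
    """Command to copy the raw profile out of debugfs after a workload run.

    `dest` is a hypothetical file name chosen by the user.
    """
    return ["cp", PROFRAW_DEBUGFS, dest]


def merge_cmd(raw_profiles, output="vmlinux.profdata"):
    """Command to process one or more raw profiles into a single profile."""
    if not raw_profiles:
        raise ValueError("at least one raw profile is required")
    return ["llvm-profdata", "merge", "-o", output, *raw_profiles]


def rebuild_cmd(profdata="vmlinux.profdata"):
    """Command to rebuild the kernel with Clang using the processed profile."""
    return ["make", "LLVM=1", f"KCFLAGS=-fprofile-use={profdata}"]


if __name__ == "__main__":
    # Hypothetical raw profiles, one saved after each workload of interest.
    raws = ["boot.profraw", "bench.profraw"]
    # Only the last collection step is shown; each earlier workload run
    # would have been followed by its own copy out of debugfs.
    steps = [collect_cmd(raws[-1]), merge_cmd(raws), rebuild_cmd()]
    for step in steps:
        # Print instead of executing: the real steps need root access and
        # a booted PGO-instrumented kernel.
        print(shlex.join(step))
```

Keeping the commands as argument lists makes it straightforward to hand them to subprocess.run later, once the paths have been checked on an actual instrumented kernel.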
Initially, this Clang PGO support for the Linux kernel is limited to x86/x86_64 until it has been verified on other architectures.
There are still more improvements pending for this Linux kernel PGO support that aren't yet in that Git branch, such as module profile data. With the currently staged code, the profile data from kernel modules isn't properly handled, but a pending patch series addresses that so kernel modules can also properly benefit from PGO.
Long story short, this work that will hopefully be landing for Linux 5.14 allows Profile Guided Optimizations to work when building the kernel with the Clang compiler. Those interested can build a kernel with this new PGO infrastructure, boot that kernel and run their desired/relevant workloads, collect the profile(s) and process them, and then rebuild the kernel leveraging said profile data. This PGO-enabled kernel build should ideally hold some performance benefits, since Clang is able to make wiser optimization choices based on that collected data.
It will be interesting to benchmark this Clang PGO support for the Linux kernel to see how it performs. PGO can really help performance for sufficiently complex code-bases when accurate profiles can be collected.