Squeezing More Performance Out Of The Linux Kernel With Clang + LTO
With Linux 5.12 having brought support for building the kernel with link-time optimizations (LTO) when using the LLVM Clang compiler, here are some benchmarks looking at the performance impact of LTO as well as, more generally, how a Clang-built Linux kernel performs relative to one built with GCC.
Using Linux 5.14-rc1, I recently benchmarked this latest kernel tree built with GCC 11, then with LLVM Clang 12, and lastly with LLVM Clang 12 with the kernel's LTO support enabled. For this initial testing, the tests were carried out on both an AMD Ryzen 9 5950X desktop and an Intel Core i9 11900K desktop. The same standard kernel configuration was used for both compilers, each in its release build. The benchmarks and other software under test were kept the same across the kernel builds: nothing was re-built or otherwise changed besides the kernel under test.
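For readers wanting to try a similar build, below is a rough sketch of the kernel configuration options involved. This is not the exact configuration used for these tests: the article does not say whether full or ThinLTO was selected, so picking ThinLTO here is purely an assumption for illustration. The option names come from the mainline Kconfig LTO choice added in Linux 5.12.

```shell
# Illustrative kernel .config fragment for a Clang LTO build (Linux 5.12+).
# Assumption: ThinLTO is chosen here; CONFIG_LTO_CLANG_FULL=y is the alternative.
# The LTO options are only offered when building with the LLVM toolchain,
# for example: make LLVM=1 LLVM_IAS=1
CONFIG_LTO_CLANG_THIN=y
# CONFIG_LTO_CLANG_FULL is not set
# CONFIG_LTO_NONE is not set
```

A plain Clang build without LTO would keep the same LLVM toolchain invocation but leave CONFIG_LTO_NONE=y selected instead.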
First up is a look at the performance on the Ryzen 9 5950X system...
One of the biggest winners from compiling the kernel with Clang, and especially from re-building it with link-time optimizations, was LevelDB, the key-value store written by Google and used by Chrome and many other software packages. LevelDB consistently performed better on the kernel built with Clang rather than GCC, and better still when link-time optimizations were employed.