The Performance Impact Of GCC CPU Tuning On The Linux Kernel
Last week a patch was proposed for the mainline Linux kernel that has long been carried by Gentoo's kernel to provide CPU optimization options. It was quickly shot down by upstream maintainers, but there were many requests to benchmark it. Here are dozens of performance figures looking at the impact of these optimizations for AMD Zen (znver1), Skylake, and Skylake X (skylake-avx512) compared to a stock mainline kernel build on several different systems.
The main patch offers Kconfig options so that at build time users can select their CPU microarchitecture, from old AMD Barcelona and Bobcat systems through Znver1 on the AMD side and from Nehalem through the Icelake and Cannonlake generations on the Intel side. Depending upon the CPU generation selected, the kernel is built with the respective "-march=" compiler flag so that the generated instructions are optimized for that particular generation of x86_64 processors. This functionality has long been an option for Gentoo users tailoring their own kernel build to their particular CPU, but over the years it hasn't been accepted upstream. The patch itself is quite simple, with GCC doing all of the actual optimization work on the generated kernel binary.
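As a rough sketch of the selection logic described above, the Python snippet below maps a chosen CPU option to the "-march=" flag that would be handed to GCC. The option names are hypothetical, borrowed from the ZNVER1, SKYLAKE, and SKYLAKEX labels used in this article rather than from the patch's actual Kconfig symbols, and `march_flag` is an illustrative helper, not part of any kernel tooling. Only the three targets tested here are listed.

```python
# Hypothetical option names (assumption: modeled on the labels used in
# this article, not on the patch's real Kconfig symbols).
MARCH_BY_OPTION = {
    "ZNVER1": "znver1",            # AMD Zen
    "SKYLAKE": "skylake",          # Intel Skylake
    "SKYLAKEX": "skylake-avx512",  # Intel Skylake X / AVX-512
}


def march_flag(option=None):
    """Return the GCC -march= flag for the selected option.

    A stock build selects no option, so no extra flag is returned.
    """
    if option is None:
        return ""
    if option not in MARCH_BY_OPTION:
        raise ValueError(f"unknown CPU option: {option!r}")
    return "-march=" + MARCH_BY_OPTION[option]


if __name__ == "__main__":
    for name in (None, "ZNVER1", "SKYLAKE", "SKYLAKEX"):
        print(name or "stock", "->", march_flag(name) or "(no -march flag)")
```

The point of the sketch is only that the choice is a one-to-one lookup: the build system picks a flag, and the compiler does the rest.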
This latest attempt was quickly shot down like those in the past, with upstream developers citing their previous statements. The arguments against such "-march=" CPU microarchitecture tuning for the kernel build have been that developers are unconvinced by the claimed performance benefits, that compiler updates could regress this functionality and lead to slower performance in the long term, and that it adds a maintenance burden. If anything, it has been suggested before to just have some basic tunables for optimizing for "modern" Intel or AMD CPUs, respectively, but not to go to the level of per-CPU-generation tuning.
To provide some independent numbers, I applied this patch against the Linux 5.0 Git kernel and ran a number of benchmarks on different systems. Using the same kernel source tree and Kconfig, I first built a stock kernel without any of this tuning for reference and then built separate kernels with the ZNVER1, SKYLAKE, and SKYLAKEX options to generate optimized builds for AMD Zen, Intel Skylake, and Intel Skylake X/AVX-512. I then used the respective kernel builds for testing different systems, including:
- AMD Ryzen 7 2700X (znver1)
- AMD Ryzen Threadripper 2990WX (znver1)
- Intel Core i7 8086K (skylake)
- Intel Core i9 7980XE (skylakex)
On each system I compared the stock kernel build to the respective optimized kernel while not changing any other hardware/software components between tests.
Do note that the RAM, storage, and graphics differ between systems. The intent of this testing isn't to compare the CPUs themselves but rather to look for any trends in whether the "-march=" GCC tuning pays off for kernel builds while the rest of each system stays the same. Ubuntu 18.10 was running on the systems under test, and dozens of benchmarks were run in a fully-automated and reproducible manner using the open-source Phoronix Test Suite.
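For readers who want to reduce a stock-versus-tuned pair of results to a single figure, here is a minimal sketch in Python. It is not how the Phoronix Test Suite reports results; `percent_gain` is a hypothetical helper, and the only assumption is that each benchmark yields one number per kernel plus a known direction (higher is better for throughput, lower is better for run time).

```python
def percent_gain(stock, tuned, higher_is_better=True):
    """Relative gain of the tuned kernel over the stock kernel, in percent.

    A positive value means the tuned build did better on that benchmark,
    whichever direction counts as better; a negative value means it regressed.
    """
    if stock <= 0:
        raise ValueError("stock result must be positive")
    change = (tuned - stock) / stock * 100.0
    # For lower-is-better metrics (e.g. seconds), a drop is an improvement.
    return change if higher_is_better else -change
```

Normalizing against the stock result on the same machine keeps the comparison within one system, which matches the intent of not comparing the CPUs against each other.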