Intel's One Line Of Linux Code For Speeding-Up Sapphire Rapids On Ubuntu
Recently I noticed that, out-of-the-box on Ubuntu Linux, the performance of Intel Xeon Scalable "Sapphire Rapids" processors was much improved for some workloads compared to tests done just weeks earlier on the same Sapphire Rapids server. The matter ended up coming full circle and ultimately boils down to one line of code added to the Linux kernel.
When running Linux 6.3 on existing Ubuntu Linux installations or when moving to newer official Ubuntu Linux kernel image builds on 22.04/22.10/23.04, the Intel Xeon Scalable Sapphire Rapids performance has improved. After investigating, the improvement came down to a change introduced during the Linux 6.3 kernel cycle and also back-ported by Canonical to existing Ubuntu Linux releases.
What was happening? This patch adjusts the balance_performance energy performance preference (EPP) for Sapphire Rapids. The one-line code change for Sapphire Rapids is a big help for distributions like Ubuntu Linux that make use of the P-State "powersave" governor by default.
This performance fix by Intel ended up being motivated by an earlier Phoronix article: CentOS Stream & Clear Linux Achieve Greater Performance On 4th Gen Xeon Scalable Sapphire Rapids, EPYC Genoa. In that article from February I showed the Xeon Platinum 8490H performance across a variety of distributions, including Ubuntu Linux both with its default P-State "powersave" governor and switching to the performance governor as used by default on CentOS/RHEL and other Linux distributions.
From there Intel Linux engineer Srinivas Pandruvada investigated and came up with the aforementioned patch, in which he explained:
While the majority of server OS distributions are deployed with the "performance" governor as the default, some distributions like Ubuntu use the "powersave" governor by default.
While using the "powersave" governor in its default configuration on Sapphire Rapids systems leads to much lower power, the performance is lower by more than 25% for several workloads relative to the "performance" governor. A 37% difference has been reported by www.Phoronix.com [1].
This is a consequence of using a relatively high EPP value in the default configuration of the "powersave" governor and the performance can be made much closer to the "performance" governor's level by adjusting the default EPP value. Based on experiments, with EPP of 0x00, 0x10, 0x20, the performance delta between the "powersave" governor and the "performance" one is around 12%. However, the EPP of 0x20 reduces average power by 18% with respect to the lower EPP values. [Note that raising min_perf_pct in sysfs as high as 50% in addition to adjusting EPP does not improve the performance any further.]
For this reason, change the EPP value corresponding to the default balance_performance setting for Sapphire Rapids to 0x20, which is straightforward, because analogous default EPP adjustment has been applied to Alder Lake and there is a way to set the balance_performance EPP value in intel_pstate based on the processor model already.
The goal here is to limit the mean performance delta between the "powersave" governor in the default configuration and the "performance" governor for a wide variety of server workloads to around 10-12%. For some bursty workloads, this delta can be still large, as the frequency ramp-up will still lag when the "powersave" governor is in use irrespective of the EPP setting, because the performance governor always requests the maximum possible frequency.
Link: https://www.phoronix.com/review/centos-clear-spr/6 # [1]
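To picture the model-based lookup the patch description refers to, here is a minimal Python sketch. It is only an illustration and not the kernel's code, which is written in C inside the intel_pstate driver. The table name, the model label and the function name are hypothetical. The 0x20 value comes from the patch description, while 0x80 as the generic balance_performance default is an assumption on my part about the usual value.

```python
# Illustrative model only: the real logic lives in the kernel's intel_pstate
# driver and is written in C. Names here are hypothetical.

# Assumption: generic balance_performance EPP value when no override exists.
DEFAULT_BALANCE_PERFORMANCE_EPP = 0x80

# Hypothetical per-model override table. The Sapphire Rapids value (0x20)
# is the one stated in the patch description.
MODEL_EPP_OVERRIDES = {
    "SAPPHIRERAPIDS_X": 0x20,
}


def balance_performance_epp(model: str) -> int:
    """Return the EPP value used for balance_performance on a given model."""
    return MODEL_EPP_OVERRIDES.get(model, DEFAULT_BALANCE_PERFORMANCE_EPP)


if __name__ == "__main__":
    for model in ("SAPPHIRERAPIDS_X", "SOME_OTHER_MODEL"):
        epp = balance_performance_epp(model)
        print(f"{model}: balance_performance EPP = {epp:#04x}")
```

Lower EPP values bias the hardware toward performance, so under these assumptions the "one line" amounts to adding a single entry to such a table.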
That patch was picked up for the mainline Linux 6.3 cycle as a late merge window change. Surprisingly, I hadn't noticed this important one-liner being merged at the time.
In turn, this fix was back-ported to Ubuntu Linux kernel releases for the Jammy / Kinetic / Lunar kernels (Ubuntu 22.04 LTS / 22.10 / 23.04). The Ubuntu kernel builds carrying this back-ported P-State change began rolling out to Ubuntu Linux users in recent weeks.
As a reminder, though, this behavior change only applies when using Ubuntu Linux with its default powersave governor configuration. If you are already manually switching over to the P-State performance governor or using a Linux distribution that defaults to the performance governor, this change won't affect your Sapphire Rapids performance. Also, keep in mind that for my CPU server reviews and the like I already test all CPUs in the "performance" governor mode, so this change doesn't affect previously published review results of Sapphire Rapids vs. AMD EPYC Genoa and similar, where all systems were running with the performance governor.
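For those wanting to check which case their own system falls into, the following is a rough Python sketch that reads the current governor and EPP setting from the standard cpufreq sysfs files for CPU 0. The helper names are hypothetical, and the check assumes that a system is affected when it runs the powersave governor with the balance_performance EPP, which is my reading of the patch description rather than anything the kernel reports directly.

```python
from pathlib import Path

# Standard cpufreq sysfs directory for CPU 0; other CPUs have the same layout.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_setting(name, base=CPUFREQ_DIR):
    """Return the stripped contents of a cpufreq sysfs file, or None if unreadable."""
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        return None


def epp_change_applies(governor, epp):
    """Assumption: the new default matters only for powersave + balance_performance."""
    return governor == "powersave" and epp == "balance_performance"


if __name__ == "__main__":
    governor = read_setting("scaling_governor")
    epp = read_setting("energy_performance_preference")
    print(f"governor: {governor}")
    print(f"energy_performance_preference: {epp}")
    if epp_change_applies(governor, epp):
        print("Default powersave setup: the adjusted EPP default is relevant here.")
    else:
        print("Not the default powersave setup: the EPP change should not matter.")
```

On systems without the intel_pstate driver or without EPP support the second file may be missing, in which case the sketch simply reports None.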
I'd still love to see Canonical switch over to using the performance governor on Ubuntu Server installations, but that's a separate matter... In any event, for those curious about what this means for Intel Sapphire Rapids performance when running with powersave / out-of-the-box on Ubuntu Linux, I ran some comparison benchmarks.