Intel P-State Patches Further Tune Linux For Better Scheduling On Hybrid CPUs

Written by Michael Larabel in Intel on 24 June 2024 at 10:00 AM EDT.
A new set of patches is currently being tested to improve task scheduling, and in turn performance, on modern Intel Core hybrid processors. With the patch also mentioning systems that are hybrid but lack SMT, this looks like it may also be some early tuning for the upcoming Intel Lunar Lake processors, which lack Hyper-Threading.

Intel engineer and Linux power management maintainer Rafael Wysocki is testing a new set of Core hybrid system optimizations within an "intel_pstate-testing" Git branch. The focus of these new patches is properly accounting for asymmetric CPU capacity on hybrid systems within the Intel P-State CPU frequency scaling driver.

Wysocki explains in the patch adding the new feature code:
"Make intel_pstate use the HWP_HIGHEST_PERF values from MSR_HWP_CAPABILITIES to set asymmetric CPU capacity information via the previously introduced arch_set_cpu_capacity() on hybrid systems without SMT.

Setting asymmetric CPU capacity is generally necessary to allow the scheduler to compute task sizes in a consistent way across all CPUs in a system where they differ by capacity. That, in turn, should help to improve scheduling decisions. It is also necessary for the schedutil cpufreq governor to operate as expected on hybrid systems where tasks migrate between CPUs of different capacities.

The underlying observation is that intel_pstate already uses MSR_HWP_CAPABILITIES to get CPU performance information which is exposed by it via sysfs and CPU performance scaling is based on it. Thus using this information for setting asymmetric CPU capacity is consistent with what the driver has been doing already. Moreover, HWP_HIGHEST_PERF reflects the maximum capacity of a given CPU including both the instructions-per-cycle (IPC) factor and the maximum turbo frequency and the units in which that value is expressed are the same for all CPUs in the system, so the maximum capacity ratio between two CPUs can be obtained by computing the ratio of their HWP_HIGHEST_PERF values. Of course, in principle that capacity ratio need not be directly applicable at lower frequencies, so using it for providing the asymmetric CPU capacity information to the scheduler is a rough approximation, but it is as good as it gets. Also, measurements indicate that this approximation is not too bad in practice.

If the given system is hybrid and non-SMT, the new code disables ITMT support in the scheduler (because it may get in the way of asymmetric CPU capacity code in the scheduler that automatically gets enabled by setting asymmetric CPU capacity) after initializing all online CPUs and finds the one with the maximum HWP_HIGHEST_PERF value. Next, it computes the capacity number for each (online) CPU by dividing the product of its HWP_HIGHEST_PERF and SCHED_CAPACITY_SCALE by the maximum HWP_HIGHEST_PERF.

When a CPU goes offline, its capacity is reset to SCHED_CAPACITY_SCALE and if it is the one with the maximum HWP_HIGHEST_PERF value, the capacity numbers for all of the other online CPUs are recomputed. This also takes care of a cleanup during driver operation mode changes.

Analogously, when a new CPU goes online, its capacity number is updated and if its HWP_HIGHEST_PERF value is greater than the current maximum one, the capacity numbers for all of the other online CPUs are recomputed.

The case when the driver is notified of a CPU capacity change, either through the HWP interrupt or through an ACPI notification, is handled similarly to the CPU online case above, except that if the target CPU is the current highest-capacity one and its capacity is reduced, the capacity numbers for all of the other online CPUs need to be recomputed either."
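
As a rough illustration of the arithmetic described in that commit message, here is a minimal Python sketch rather than the driver's actual C code. The function names and the example HWP_HIGHEST_PERF values (64 for P cores, 40 for E cores) are made up for illustration, and the sketch recomputes every capacity on each change instead of only when the maximum changes. SCHED_CAPACITY_SCALE is 1024 in the kernel.

```python
SCHED_CAPACITY_SCALE = 1024  # the scheduler's "full capacity" value


def compute_capacities(highest_perf):
    """Map {cpu: HWP_HIGHEST_PERF} for online CPUs to {cpu: capacity}.

    capacity = HWP_HIGHEST_PERF * SCHED_CAPACITY_SCALE / max(HWP_HIGHEST_PERF)
    Integer (floor) division is an assumption made for this sketch.
    """
    if not highest_perf:
        return {}
    max_perf = max(highest_perf.values())
    return {
        cpu: perf * SCHED_CAPACITY_SCALE // max_perf
        for cpu, perf in highest_perf.items()
    }


def capacities_after_offline(highest_perf, offline_cpu):
    """Capacities once offline_cpu has gone offline.

    The offline CPU is reset to SCHED_CAPACITY_SCALE; the remaining online
    CPUs are rescaled against the new maximum (hypothetical helper that
    always recomputes, for simplicity).
    """
    remaining = {c: p for c, p in highest_perf.items() if c != offline_cpu}
    capacities = compute_capacities(remaining)
    capacities[offline_cpu] = SCHED_CAPACITY_SCALE
    return capacities


# Hypothetical values: CPU 0 is a P core, CPUs 1 and 2 are E cores.
perf = {0: 64, 1: 40, 2: 40}
print(compute_capacities(perf))
print(capacities_after_offline(perf, 0))
```

With these made-up numbers the E cores get a capacity of 640 while the P core is online. Once the P core goes offline, the E cores become the highest-capacity CPUs and are rescaled to 1024.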

With the focus on hybrid systems without SMT (Hyper-Threading), the motivation is presumably the upcoming Lunar Lake processors, whose P cores do away with HT.

Intel Lunar Lake core layout


The Intel P-State testing patches do not include any benchmark numbers quantifying the performance or scheduling impact of this latest code relative to the current upstream Linux 6.10 state. Hopefully these testing patches will be deemed ready for merging in time for next month's Linux 6.11 merge window.

Separately, the Linux power management "-next" branch contains a patch that updates the hybrid scaling factor for Intel Lunar Lake in the Intel P-State driver. The hybrid scaling factor increases from 80000 with Meteor Lake / Arrow Lake to 86957 for Lunar Lake. This further distinguishes between P and E core scaling behavior in the P-State driver for the upcoming Lunar Lake processors.
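
To put the two numbers in perspective, here is a small Python sketch. It assumes the hybrid scaling factor is the number of kHz represented by one HWP performance unit on a P core. That is my reading of the driver, not something stated in the patch description quoted here. The constant names, the helper function and the performance level of 50 are purely illustrative.

```python
# Values quoted above; the constant names are illustrative only.
HYBRID_SCALING_FACTOR_MTL = 80000  # Meteor Lake / Arrow Lake
HYBRID_SCALING_FACTOR_LNL = 86957  # Lunar Lake


def perf_to_khz(perf_level, scaling_factor):
    """Convert an HWP performance level to a frequency in kHz.

    Assumption: the scaling factor is kHz per performance unit.
    """
    return perf_level * scaling_factor


perf_level = 50  # made-up P core performance level
print(perf_to_khz(perf_level, HYBRID_SCALING_FACTOR_MTL))  # 4000000
print(perf_to_khz(perf_level, HYBRID_SCALING_FACTOR_LNL))  # 4347850
```

Under that assumption, the same performance level corresponds to a frequency roughly 8.7% higher with the Lunar Lake factor than with the Meteor Lake / Arrow Lake one.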