Linux 6.12 To Drop Old Code That Slows Down CPU Frequency Polling

Written by Michael Larabel in Linux Kernel on 3 August 2024 at 01:00 PM EDT.
The Linux 6.12 kernel cycle later this year will bring a change that affects users of the "Schedutil" CPU frequency scaling governor. The change drops the "LATENCY_MULTIPLIER" that has been in the kernel code for the past two decades to slow down how frequently the CPU frequency evaluation occurs. In turn, the revised logic allows that CPUFreq frequency re-evaluation to occur more often.

The CPUFreq LATENCY_MULTIPLIER causes the polling interval to be 1000x the transition latency of the processor, with some exceptions / limits to the maximum delay. That 1000x multiplier once made sense but not so much anymore with modern processors. Qais Yousef, who pushed for the LATENCY_MULTIPLIER removal, explained in his patch:
"The current LATENCY_MULTIPLIER which has been around for nearly 20 years causes rate_limit_us to be always in ms range.

On M1 mac mini I get 50 and 56us transition latency, but due to the 1000 multiplier we end up setting rate_limit_us to 50 and 56ms, which gets capped into 2ms and was 10ms before e13aa799c2a6 ("cpufreq: Change default transition delay to 2ms")

On Intel I5 system transition latency is 20us but due to the multiplier we end up with 20ms that again is capped to 2ms.

Given how good modern hardware and how modern workloads require systems to be more responsive to cater for sudden changes in workload (tasks sleeping/wakeup/migrating, uclamp causing a sudden boost or cap) and that 2ms is quarter of the time of 120Hz refresh rate system, drop the old logic in favour of providing 50% headroom.

rate_limit_us = 1.5 * latency.

I considered not adding any headroom which could mean that we can end up with infinite back-to-back requests.

I also considered providing a constant headroom (e.g: 100us) assuming that any h/w or f/w dealing with the request shouldn't require a large headroom when transition_latency is actually high.

But for both cases I wasn't sure if h/w or f/w can end up being overwhelmed dealing with the freq requests in a potentially busy system. So I opted for providing 50% breathing room.

This is expected to impact schedutil only as the other user, dbs_governor, takes the max(2*tick, transition_delay_us) and the former was at least 2ms on 1ms TICK, which is equivalent to the max_delay_us before applying this patch. For systems with TICK of 4ms, this value would have almost always ended up with 8ms sampling rate.

For systems that report 0 transition latency, we still default to returning 1ms as transition delay.

This helps in eliminating a source of latency for applying requests..."
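
The arithmetic in the patch message can be sketched in a few lines of Python. This is only an illustration built from the numbers quoted above, not the kernel's actual C code: the function names and the `cap_us` parameter are hypothetical, and integer microseconds are assumed throughout.

```python
USEC_PER_MSEC = 1000

def old_transition_delay_us(latency_us, multiplier=1000, cap_us=2 * USEC_PER_MSEC):
    """Old behaviour as described in the patch message (hypothetical helper):
    multiply the transition latency by 1000, then cap the result."""
    if latency_us == 0:
        # Hardware reporting 0 latency falls back to a 1ms delay.
        return USEC_PER_MSEC
    return min(latency_us * multiplier, cap_us)

def new_transition_delay_us(latency_us):
    """New behaviour as described: rate_limit_us = 1.5 * latency."""
    if latency_us == 0:
        # The 1ms default for 0 reported latency is kept.
        return USEC_PER_MSEC
    return latency_us + latency_us // 2  # 50% headroom, integer math

# Transition latencies quoted in the patch message (microseconds).
for name, latency_us in [("M1 Mac mini", 50), ("M1 Mac mini", 56), ("Intel i5", 20)]:
    old = old_transition_delay_us(latency_us)
    new = new_transition_delay_us(latency_us)
    print(f"{name}: latency={latency_us}us old={old}us new={new}us")
# M1 Mac mini: latency=50us old=2000us new=75us
# M1 Mac mini: latency=56us old=2000us new=84us
# Intel i5: latency=20us old=2000us new=30us
```

Passing `cap_us=10 * USEC_PER_MSEC` to the old helper models the 10ms cap that applied before commit e13aa799c2a6.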

This patch, which removes the latency multiplier to lower the latency of CPU frequency evaluation/selection, is being picked up as part of the power management subsystem changes intended for the Linux 6.12 kernel.
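
The patch message's claim that only schedutil is affected can be checked with a similar rough sketch. It assumes, per the quoted text, that dbs_governor takes max(2*tick, transition_delay_us); the function name and the sample values are illustrative and not taken from the kernel source.

```python
def dbs_sampling_rate_us(tick_us, transition_delay_us):
    """Sampling rate as described in the patch message (hypothetical helper):
    the larger of two scheduler ticks and the transition delay."""
    return max(2 * tick_us, transition_delay_us)

# Example delays for a CPU with 50us transition latency:
# 2000us under the old capped 1000x rule, 75us under the new 1.5x rule.
OLD_DELAY_US = 2000
NEW_DELAY_US = 75

for tick_us in (1000, 4000):  # 1ms and 4ms TICK
    before = dbs_sampling_rate_us(tick_us, OLD_DELAY_US)
    after = dbs_sampling_rate_us(tick_us, NEW_DELAY_US)
    print(f"tick={tick_us}us before={before}us after={after}us")
# tick=1000us before=2000us after=2000us
# tick=4000us before=8000us after=8000us
```

In both cases the two-tick floor dominates, so the sampling rate is the same before and after the change for these inputs.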

We'll see what more CPUFreq and P-State driver enhancements come for the power management code over the coming weeks to benefit Linux 6.12, which is likely to be this year's Long Term Support (LTS) kernel version.