Significant Performance & Perf-Per-Watt Gains Coming For Intel CPUs On Linux Schedutil


  • Significant Performance & Perf-Per-Watt Gains Coming For Intel CPUs On Linux Schedutil

    Phoronix: Significant Performance & Perf-Per-Watt Gains Coming For Intel CPUs On Linux Schedutil

    Sadly they did not make it into the just-closed Linux 5.4 merge window, but hopefully something we could see in Linux 5.5 are the recent patches on "frequency invariance" for optimizing the Schedutil frequency scaling governor, which will really benefit Intel CPUs and improve their performance by double digits...

    http://www.phoronix.com/scan.php?pag...ncy-Invariance

  • atomsymbol
    replied
    Dynamic power consumed by CPU = frequency*capacitance*square(voltage), with the constraint/relationship that an increase in frequency requires an increase in voltage. Avoiding turbo boost frequencies when running on battery means lower voltage requirements.

    https://en.wikipedia.org/wiki/CPU_po...pation#Sources
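
    To make the relationship above concrete, here is a minimal sketch that computes relative dynamic power from a frequency ratio and a voltage ratio, using P = f*C*V^2 with the capacitance held constant. The helper name and the example ratios (a 20% higher clock that needs 10% more voltage) are hypothetical illustration values, not measurements.

```shell
# Relative dynamic power for a change in frequency and voltage,
# assuming constant capacitance: P2/P1 = (f2/f1) * (V2/V1)^2.
# relative_dynamic_power is a hypothetical helper name.
relative_dynamic_power() {
    awk -v f="$1" -v v="$2" 'BEGIN { printf "%.3f\n", f * v * v }'
}

# Hypothetical example: 20% higher clock that needs 10% more voltage.
relative_dynamic_power 1.2 1.1   # prints 1.452
```

    Under these assumed ratios, a 20% clock increase costs about 45% more dynamic power, which is why skipping turbo frequencies on battery can pay off.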


  • Shevchen
    replied
    Originally posted by Michael View Post

    It relies upon some Intel specific MSRs.
    Gotcha, thx.


  • Michael
    replied
    Originally posted by Shevchen View Post
    Maybe I've missed a point when reading the article, but the general behavior seems to improve all x86 CPUs, correct? Why are Intel CPUs singled out then?

    Reading about CPU tweaking improvements over the last couple of months, I got the impression that AMD's Zen 2 cores benefit from proper transient response in boost behavior: keeping boosting cores in the boosting state as long as the thermal limit isn't hit, keeping a task on the same CCX/CCD where it is already being processed, and switching between "hot" and "cold" cores as long as the required data is still in the L1/L2 cache.

    Is there a bias towards Intel CPUs, or is the work on AMD CPUs just ahead?

    The last thing Linux needs is a biased performance governor, so I'll just ask nicely if there is a bias.
    It relies upon some Intel specific MSRs.


  • Shevchen
    replied
    Maybe I've missed a point when reading the article, but the general behavior seems to improve all x86 CPUs, correct? Why are Intel CPUs singled out then?

    Reading about CPU tweaking improvements over the last couple of months, I got the impression that AMD's Zen 2 cores benefit from proper transient response in boost behavior: keeping boosting cores in the boosting state as long as the thermal limit isn't hit, keeping a task on the same CCX/CCD where it is already being processed, and switching between "hot" and "cold" cores as long as the required data is still in the L1/L2 cache.

    Is there a bias towards Intel CPUs, or is the work on AMD CPUs just ahead?

    The last thing Linux needs is a biased performance governor, so I'll just ask nicely if there is a bias.


  • perpetually high
    replied
    Originally posted by geearf View Post

    You have to either disable pstate or switch it to passive, if not you have no cpufreq governor.
    Originally posted by skeevy420 View Post

    I use the "cpupower" package and its systemd unit. It's configured under "/etc/default/cpupower". I'm including mine because I've added some notes from the man pages it tells you to reference; those pages only contain relevant information on the perf_bias setting. On my Manjaro box, enabling either the smp_* or the mc_* setting prevents it from working, so I don't use them. Setting perf_bias to 0 is the same as running the command I posted above.
    Big thanks to both of you. I'm finally up and running. The intel_pstate=passive was what I was missing! I did some light gaming and multitasking on the desktop and schedutil was very smooth. This is with the patches Michael posted about, so I can't compare against a kernel without them, but I'm impressed so far. Gonna try it out for a bit.

    Code:
    ~ ❯ sudo cpupower frequency-set --governor schedutil
    Setting cpu: 0
    Error setting new values. Common errors:
    - Do you have proper administration rights? (super-user?)
    - Is the governor you requested available and modprobed?
    - Trying to set an invalid policy?
    - Trying to set a specific frequency, but userspace governor is not available,
       for example because of hardware which cannot be set to a specific frequency
       or because the userspace governor isn't loaded?
    ~ ❯ sudo cpupower frequency-set --governor powersave
    Setting cpu: 0
    Setting cpu: 1
    Setting cpu: 2
    Setting cpu: 3
    ~ ❯ sudo cpupower frequency-set --governor performance
    Setting cpu: 0
    Setting cpu: 1
    Setting cpu: 2
    Setting cpu: 3
    ~ ❯ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
    intel_pstate
    ~ ❯ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    performance
    ~ ❯ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_available_governors
    performance powersave
    performance powersave
    performance powersave
    performance powersave
    After adding intel_pstate=passive to GRUB and rebooting:

    Code:
    ~ ❯ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_available_governors
    conservative ondemand userspace powersave performance schedutil
    conservative ondemand userspace powersave performance schedutil
    conservative ondemand userspace powersave performance schedutil
    conservative ondemand userspace powersave performance schedutil
    ~ ❯ sudo cpupower frequency-set --governor schedutil
    Setting cpu: 0
    Setting cpu: 1
    Setting cpu: 2
    Setting cpu: 3
    ~ ❯ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
    intel_cpufreq
    ~ ❯ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    schedutil
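
    The check from the session above can be condensed into a small helper. This is only a sketch: has_governor is a hypothetical name, and the optional second argument (a cpufreq directory) exists purely so the function can be pointed somewhere other than cpu0's sysfs path.

```shell
# Return success if governor $1 is listed in scaling_available_governors.
# $2 is the cpufreq directory to inspect; it defaults to cpu0's sysfs path.
# has_governor is a hypothetical helper name.
has_governor() {
    gov=$1
    dir=${2:-/sys/devices/system/cpu/cpu0/cpufreq}
    [ -r "$dir/scaling_available_governors" ] || return 2
    for g in $(cat "$dir/scaling_available_governors"); do
        [ "$g" = "$gov" ] && return 0
    done
    return 1
}

if has_governor schedutil; then
    echo "schedutil is available"
else
    echo "schedutil is not offered by the current driver;" \
         "with intel_pstate, try booting with intel_pstate=passive"
fi
```

    On GRUB-based systems the parameter is typically appended to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, after which the GRUB config has to be regenerated (the exact command depends on the distribution). Kernels that expose /sys/devices/system/cpu/intel_pstate/status may also accept writing "passive" to it at runtime, but I have not verified that on this setup.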


  • pruzinat
    replied
    Hm, I have always disabled all frequency scaling in the kernel and relied on C-states, and it never seemed to make a big difference (last time I measured, it was about a watt on a laptop and 3 W on a desktop, as measured by Intel's performance counters thingy). Do the scheduler and cpufreq/pstate actually make a noticeable difference to how long a laptop lasts on battery? I always thought that the power management firmware in modern laptops did most of the work. As far as the desktop goes, the difference between 30 W and 35 W drawn from the outlet never seemed important (my old SIPS monitor has 12k hours on its backlight and draws ~40 W, so it's a drop in the bucket).
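
    For anyone who wants to repeat this kind of measurement, here is a rough sketch that derives average package power from two energy counter readings. It assumes the intel-rapl powercap interface is what is meant by the performance counters above, which the post does not confirm; avg_watts is a hypothetical helper, and counter wraparound is ignored.

```shell
# Average power in watts from two energy readings in microjoules taken
# $3 seconds apart. Counter wraparound is ignored in this sketch.
avg_watts() {
    awk -v e1="$1" -v e2="$2" -v dt="$3" \
        'BEGIN { printf "%.2f\n", (e2 - e1) / 1000000 / dt }'
}

# Assumed path: package 0 of the intel-rapl powercap driver. Reading it
# may require root on recent kernels, so the sample is skipped if unreadable.
rapl=/sys/class/powercap/intel-rapl:0/energy_uj
interval=1
if [ -r "$rapl" ]; then
    e1=$(cat "$rapl"); sleep "$interval"; e2=$(cat "$rapl")
    echo "package power: $(avg_watts "$e1" "$e2" "$interval") W"
fi
```

    Running it once per governor under the same workload gives a like-for-like comparison, though it only covers the CPU package and not the draw at the outlet.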


  • skeevy420
    replied
    Originally posted by nuetzel View Post
    So true, even if we have to remember this:
    That quote you added is why I use it now instead of ondemand. I made the switch somewhere around 4.17 or 4.18 IIRC.

    GruenSein

    While I can't comment for others, on my system schedutil has no problems holding the boost frequency with or without this patch. I have the best cooling possible for my Dell workstation (OEM) and that keeps my CPUs under 65C & 75C at full load (CPU 0 has a bigger, better heat sink than CPU 1, hence the two reported highs).


  • nuetzel
    replied
    Originally posted by GruenSein View Post
    How is this evaluation procedure of CPU load specific to Intel? If the load is never evaluated based on the CPU's maximum capability, this might apply to all CPUs which adjust their frequency based on current load.
    So true, even if we have to remember this:

    'Meanwhile, kernel developers are hoping for a future where Schedutil could potentially replace the other existing frequency scaling governors.'
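
    For context on the load-evaluation question, here is a minimal sketch of the selection rule as described in the comments of the kernel's schedutil source: with frequency-invariant utilization the governor scales from the maximum frequency (next_freq = 1.25 * max_freq * util / max), otherwise from the current frequency (next_freq = 1.25 * curr_freq * util_raw / max). The helper name and the sample numbers are hypothetical, and the real governor applies further steps (rate limiting, frequency table lookup) that are left out.

```shell
# next_freq = 1.25 * base_freq * util / max_capacity, in integer math.
# base_freq is max_freq when utilization is frequency-invariant and
# the current frequency otherwise. schedutil_next_freq is a hypothetical name.
schedutil_next_freq() {
    base_khz=$1 util=$2 cap=$3
    echo $(( (base_khz + base_khz / 4) * util / cap ))
}

# Hypothetical CPU: 4.0 GHz max, currently at 2.0 GHz, utilization 512 of 1024.
schedutil_next_freq 4000000 512 1024   # invariant: prints 2500000
schedutil_next_freq 2000000 512 1024   # non-invariant: prints 1250000
```

    With the same reported utilization, the non-invariant path asks for half the frequency in this example, which illustrates why the missing invariance can hold a CPU back.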


  • skeevy420
    replied
    Originally posted by perpetually high View Post

    Compiled for me on 5.3.2 as well, failed on 5.4-rc1. Thanks for the tip on x86_energy_perf_policy (and the zstd patch!)

    Question for you - so uh, how do I enable schedutil? I'm on intel_pstate at the moment and can't seem to switch. I have CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y in my kernel config as well.
    I use the "cpupower" package and its systemd unit. It's configured under "/etc/default/cpupower". I'm including mine because I've added some notes from the man pages it tells you to reference; those pages only contain relevant information on the perf_bias setting. On my Manjaro box, enabling either the smp_* or the mc_* setting prevents it from working, so I don't use them. Setting perf_bias to 0 is the same as running the command I posted above.

    Code:
    # Define CPUs governor
    # valid governors: ondemand, performance, powersave, conservative, userspace.
    governor='schedutil'
    
    # Limit frequency range
    # Valid suffixes: Hz, kHz (default), MHz, GHz, THz
    #min_freq="2.25GHz"
    #max_freq="3GHz"
    
    # Specific frequency to be set.
    # Requires userspace governor to be available.
    # Do not set governor field if you use this one.
    #freq=
    
    # Utilizes cores in one processor package/socket first before processes are
    # scheduled to other processor packages/sockets.
    # See man (1) CPUPOWER-SET for additional details.
    # From Red Hat
    # Restricts the use of power by system processes to the cores in one CPU package
    # before other CPU packages are drawn from. 0 sets no restrictions, 1 initially
    # employs only a single CPU package, and 2 does this in addition to favoring
    # semi-idle CPU packages for handling task wakeups.
    #mc_scheduler=0
    
    # Utilizes thread siblings of one processor core first before processes are
    # scheduled to other cores. See man (1) CPUPOWER-SET for additional details.
    # From Red Hat
    # Restricts the use of power by system processes to the thread siblings of one CPU
    # core before drawing on other cores. 0 sets no restrictions, 1 initially employs only a
    # single CPU package, and 2 does this in addition to favouring semi-idle CPU packages
    # for handling task wakeups.
    #smp_scheduler=0
    
    # Sets a register on supported Intel processors which allows software to convey
    # its policy for the relative importance of performance versus energy savings to
    # the processor. See man (1) CPUPOWER-SET for additional details.
    # from "man cpupower-set"
    # Sets a register on supported Intel processors which allows software to convey its policy
    # for the relative importance of performance versus energy savings to the processor.
    # The range of valid numbers is 0-15, where 0 is maximum performance and 15 is maximum energy efficiency.
    perf_bias=0
    
    # vim:set ts=2 sw=2 ft=sh et:
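
    To try a perf_bias value by hand before putting it in the file, something like the following sketch can help. perf_bias_cmd is a hypothetical helper that only validates the 0-15 range from the man page excerpt above and prints the matching cpupower command; it does not run it.

```shell
# Print the cpupower command that applies a given perf_bias value,
# rejecting anything outside the documented 0-15 range.
# perf_bias_cmd is a hypothetical helper; it only prints, it does not run.
perf_bias_cmd() {
    case $1 in
        ''|*[!0-9]*) echo "perf_bias must be an integer 0-15" >&2; return 1 ;;
    esac
    [ "$1" -le 15 ] || { echo "perf_bias must be an integer 0-15" >&2; return 1; }
    echo "cpupower set --perf-bias $1"
}

perf_bias_cmd 0   # prints: cpupower set --perf-bias 0
```

    Running the printed command needs root and a CPU that supports the register. To have the file applied at boot, the systemd unit shipped with the package has to be enabled; its exact name depends on the distribution's packaging.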
