AMD Performance On Linux 5.11 Remains Mixed Due To Schedutil With Frequency Invariance


  • #21
    Originally posted by jeisom View Post

    I just tested this on my 3800X with Linux 5.10, and the temps went up about 9°C while idling after I switched to the performance governor. They might have gone up more, but I switched back once it reached 43°C. cpupower also reported ~4GHz on all cores at idle with the performance governor.
    Then obviously AMD's Ryzen CPUs are bugged, since Intel CPUs don't exhibit the same behaviour - yes, even with the acpi_cpufreq performance governor!


    • #22
      Michael
      I'm wondering if "rate_limit_us" tunable would help here

      Originally posted by Linuxxx View Post
      What's actually interesting here is this little detail:

      Scaling Governor: acpi-cpufreq schedutil

      Now when checking the value of
      /sys/devices/system/cpu/cpufreq/schedutil/rate_limit_us it should read
      10000 (i.e. 10ms) which is just way too high! Try what Android already does (echo 0 | sudo tee /sys/devices/system/cpu/cpufreq/schedutil/rate_limit_us) and see how that works out for you (which could be anyone, really).
      There are different values for different hardware.
      "10000" - for cpufreq
      "500" - for intel_pstate (and probably intel_cpufreq without HWP)
      "5000" - for intel_cpufreq with HWP
      Last edited by xAlt7x; 05 January 2021, 09:29 AM.
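
For anyone who wants to check which value their own machine is using, here is a minimal sketch. It assumes the schedutil tunables live either in one global directory (cpufreq/schedutil/) or in one directory per policy (cpufreq/policyN/schedutil/), depending on the cpufreq driver. show_rate_limit is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Sketch: print schedutil's rate_limit_us wherever the kernel exposes it.
# Assumption: tunables are either global (cpufreq/schedutil/) or
# per-policy (cpufreq/policyN/schedutil/), depending on the driver.
# show_rate_limit is a hypothetical helper name.
show_rate_limit() {
    root="${1:-/sys/devices/system/cpu/cpufreq}"
    found=0
    for f in "$root"/schedutil/rate_limit_us "$root"/policy*/schedutil/rate_limit_us; do
        # Skip locations that do not exist (or unmatched globs).
        [ -r "$f" ] || continue
        printf '%s: %s us\n' "${f#"$root"/}" "$(cat "$f")"
        found=1
    done
    # Nothing found usually means schedutil is not the active governor.
    [ "$found" -eq 1 ] || { echo "schedutil tunables not found" >&2; return 1; }
}
```

Calling show_rate_limit with no argument reads the real sysfs tree; passing a directory as the first argument points it somewhere else, which is handy for trying it out without touching the system.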


      • #23
        Originally posted by xAlt7x View Post
        Michael
        I wonder if "rate_limit_us" tunable would help here


        There are different values for different hardware.
        "10000" - for cpufreq
        "500" - for intel_pstate (and probably intel_cpufreq without HWP)
        "5000" - for intel_cpufreq with HWP
        I had already tried rate_limit_us tuning, no major difference.
        Michael Larabel
        https://www.michaellarabel.com/


        • #24
          Originally posted by Michael View Post

          I had already tried rate_limit_us tuning, no major difference.
          The real difference setting "rate_limit_us" to zero makes is in the way your system reacts to a sudden change in load, i.e. how 'snappy' your machine feels while using it (therefore latency-related).
          Now Android by default does the only sensible thing here:
          By setting this value to zero they ensure that your system (most likely your smartphone) will 'feel' as smooth as possible while using it, so that it doesn't need to 'feel' particularly ashamed when compared to Apple's iOS.

          Of course the same can't be said for desktop Linux distros, where the default is guaranteed to introduce stutters everywhere...

          Anyway, the only place where schedutil makes any sense is in an APU system, where the power budget is shared between the CPU & GPU (similar in nature to a smartphone SoC, hence the use of schedutil by Android).
          Anyone else should just stick to the performance governor for best results - or, if using an AMD Ryzen system, try blocking the acpi-cpufreq driver and see how that compares to the default...

          Come to think of it, maybe that would make for an interesting benchmark comparison, Michael ?


          • #25
            Originally posted by Linuxxx View Post

            The real difference setting "rate_limit_us" to zero makes is in the way your system reacts to a sudden change in load, i.e. how 'snappy' your machine feels while using it (therefore latency-related).
            Now Android by default does the only sensible thing here:
            By setting this value to zero they ensure that your system (most likely your smartphone) will 'feel' as smooth as possible while using it, so that it doesn't need to 'feel' particularly ashamed when compared to Apple's iOS.

            Of course the same can't be said for desktop Linux distros, where the default is guaranteed to introduce stutters everywhere...

            Anyway, the only place where schedutil makes any sense is in an APU system, where the power budget is shared between the CPU & GPU (similar in nature to a smartphone SoC, hence the use of schedutil by Android).
            Anyone else should just stick to the performance governor for best results - or, if using an AMD Ryzen system, try blocking the acpi-cpufreq driver and see how that compares to the default...

            Come to think of it, maybe that would make for an interesting benchmark comparison, Michael ?
            I don't recall whether I tried a rate_limit_us of 0, but if I didn't, I will try it to see whether it helps much.
            Michael Larabel
            https://www.michaellarabel.com/


            • #26
              Originally posted by Linuxxx View Post

              The real difference setting "rate_limit_us" to zero makes is in the way your system reacts to a sudden change in load, i.e. how 'snappy' your machine feels while using it (therefore latency-related)
              I'm not sure it's a "silver bullet". We should also evaluate latency under load. With the "ondemand" cpufreq governor we're doing exactly the opposite to make the system smoother during kernel compilation.
              Code:
              echo 10 | sudo tee /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor
              It's just a thought, I really have no idea why those values differ so much for various modes.


              • #27
                Originally posted by Linuxxx View Post

                The real difference setting "rate_limit_us" to zero makes is in the way your system reacts to a sudden change in load, i.e. how 'snappy' your machine feels while using it (therefore latency-related).
                Now Android by default does the only sensible thing here:
                By setting this value to zero they ensure that your system (most likely your smartphone) will 'feel' as smooth as possible while using it, so that it doesn't need to 'feel' particularly ashamed when compared to Apple's iOS.
                Are you sure this is the right parameter? Most of the perceived stuttering will most likely come from the CPU having to wake up from idle states. AFAIK this takes a lot longer than switching the frequency of an active CPU and imposes a much higher performance penalty because an idle CPU is literally doing nothing. C-state transitions have nothing to do with schedutil (or any other cpufreq governor). Android defaults are probably optimized for ARM devices, which probably behave differently in terms of frequency scaling.
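
One way to put numbers on the idle-state side of this argument is to read the exit latency the kernel reports for each idle state and compare it with rate_limit_us, since both are in microseconds. This is only a sketch: it reads the cpuidle sysfs files "name" and "latency" for one CPU, and list_idle_states is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list the idle states of one CPU with their exit latency.
# Reads the cpuidle sysfs files "name" and "latency" (microseconds).
# list_idle_states is a hypothetical helper name.
list_idle_states() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for s in "$dir"/state*; do
        # Skip anything that is not a readable state directory.
        [ -r "$s/name" ] && [ -r "$s/latency" ] || continue
        printf '%s\t%s us\n' "$(cat "$s/name")" "$(cat "$s/latency")"
    done
}
```

With no argument it inspects cpu0; the reported latencies are whatever the platform firmware and idle driver advertise, so they vary between machines.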


                • #28
                  Originally posted by MadCatX View Post
                  Are you sure this is the right parameter? Most of the perceived stuttering will most likely come from the CPU having to wake up from idle states. AFAIK this takes a lot longer than switching the frequency of an active CPU and imposes a much higher performance penalty because an idle CPU is literally doing nothing. C-state transitions have nothing to do with schedutil (or any other cpufreq governor). Android defaults are probably optimized for ARM devices, which probably behave differently in terms of frequency scaling.
                  No, you're wrong.
                  Schedutil was developed precisely as the special governor with a focus on latency, in that it is the only one that takes C-states into account while deciding which task|thread to place on which core:
                  Schedutil tries to place pending jobs on cores which are sleeping in the lightest sleep state, i.e. the ones which can react the fastest.

                  Keeping the latency to an absolute minimum is the prime focus for all of Google's products, be it Android, ChromeOS or Stadia.
                  That's why all of them use a Linux kernel with a config similar to Ubuntu's "lowlatency" variant, so 1000 Hz kernel tick + full kernel preemption (PREEMPT).

                  Unfortunately the same can't be said for desktop Linux; apart from Ubuntu Studio, no other distro that I'm aware of caters to achieving the lowest latency possible by default.

                  Of course Google tries to achieve the best possible user experience, since they have a worldwide customer base in the billions (mainly thanks to Android, of course).
                  Also, they are competing with Apple in the smartphone space, which mainly focuses on shiny graphics & smooth animations (even if the rest of the OS is as slow as it gets).

                  But alas, a good & smooth user experience is what counts for most people, and is one of the many reasons why desktop Linux will never achieve a market share of even just 5-10 %.
                  Sad, but very much true...


                  • #29
                    Originally posted by Linuxxx View Post

                    No, you're wrong.
                    Schedutil was developed precisely as the special governor with a focus on latency, in that it is the only one that takes C-states into account while deciding which task|thread to place on which core:
                    Schedutil tries to place pending jobs on cores which are sleeping in the lightest sleep state, i.e. the ones which can react the fastest.
                    Where did you get this information? Schedutil is a frequency scaling governor. Its job is to decide at what clock speed a CPU core should run to achieve optimal performance and power consumption. It doesn't deal with task scheduling. You seem to be confusing schedutil with CFS, BFS, MuQSS and the like. Schedutil uses information from the task scheduler to make better decisions, but it doesn't replace it.
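
To illustrate what "deciding the clock speed from scheduler information" means in practice: in kernels of this era, schedutil's documented rule of thumb with frequency invariance is roughly next_freq = 1.25 * max_freq * util / max, where util is the scheduler's utilization estimate and max is the CPU's capacity. The sketch below only reproduces that arithmetic; next_freq is a hypothetical helper name, the inputs are made-up numbers, and the real governor additionally maps the result onto a frequency the driver supports.

```shell
#!/bin/sh
# Sketch of schedutil's frequency-invariant target calculation:
#   next_freq = 1.25 * max_freq * util / max_capacity
# Usage: next_freq UTIL MAX_CAPACITY MAX_FREQ_KHZ
# next_freq is a hypothetical helper name; inputs are illustrative.
next_freq() {
    util="$1"; max_cap="$2"; max_freq="$3"
    # Integer form of the 1.25 factor: max_freq + max_freq / 4.
    echo $(( (max_freq + max_freq / 4) * util / max_cap ))
}
```

For example, next_freq 512 1024 4000000 (a half-loaded CPU with a 4 GHz maximum) yields 2500000 kHz. The 1.25 factor means the target reaches the maximum frequency at about 80% utilization rather than at 100%.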


                    • #30
                      Originally posted by piorunz View Post
                      I think I will just stay on the 5.10 LTS kernel until at least 5.12 or possibly longer, and will monitor the situation. There is nothing significant in 5.11 that would warrant its use instead of 5.10, really, for me at least - using a Ryzen 1600X on my main machine. No need to fall into regressions or other complications.
                      Or you can just use ondemand or performance governors. Works well for me.
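
Switching governors does not need a different kernel; it can be done at runtime through sysfs. A minimal sketch follows, assuming the usual per-policy files scaling_governor and scaling_available_governors and root privileges for the write. set_governor is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: set one cpufreq governor on every policy that offers it.
# Usage: set_governor GOVERNOR [CPUFREQ_SYSFS_DIR]
# Assumes the per-policy sysfs files scaling_governor and
# scaling_available_governors; writing needs root.
# set_governor is a hypothetical helper name.
set_governor() {
    gov="$1"
    root="${2:-/sys/devices/system/cpu/cpufreq}"
    for p in "$root"/policy*; do
        [ -w "$p/scaling_governor" ] || continue
        # Skip policies whose driver does not offer this governor.
        if [ -r "$p/scaling_available_governors" ] &&
           ! grep -qw "$gov" "$p/scaling_available_governors"; then
            echo "$gov not available for ${p##*/}" >&2
            continue
        fi
        echo "$gov" > "$p/scaling_governor"
    done
}
```

Run as root, set_governor ondemand or set_governor performance applies the change until the next reboot; making it permanent depends on the distribution.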
