Deferred Enabling Of ACPI CPUFreq Boost Support Can Help Boot Times For Large Servers

  • Deferred Enabling Of ACPI CPUFreq Boost Support Can Help Boot Times For Large Servers

    Phoronix: Deferred Enabling Of ACPI CPUFreq Boost Support Can Help Boot Times For Large Servers

    With CPU core counts continuing to rise, we've seen various optimization efforts in recent times to help with the boot speed of getting large servers online. One of the latest discoveries can trim the boot time by up to 30 seconds for some large servers and what appears to be a next-generation AMD EPYC "Genoa" platform...

    https://www.phoronix.com/news/CPUFreq-Defer-Boost-MSRs

  • #2
    Hmm. If the costs are O(n), 1/192 of 30s is... not small.



    • #3
      Originally posted by yump View Post
      Hmm. If the costs are O(n), 1/192 of 30s is... not small.
      Speaking generally, a lot of code that we call O(N) actually has a running time closer to T(N) = c + f·N, where c is a constant covering fixed setup costs and f is a constant scaling factor. However, big-O notation, by design/definition, keeps only the fastest-growing term.

      (Side note: this is why an algorithm with worse scaling can actually be faster for small sizes, because it has a lower fixed cost or a smaller constant scaling factor. Then there are effects like cache behaviour and branch prediction that might, for example, make a naive linear search faster than a binary search for very small N.)
      In this case I would be surprised if you gained 8/192 of 30 s on an 8-thread CPU even if the scaling were linear.
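
      The cost model in the side note can be made concrete with a toy sketch. The constants below are invented purely for illustration (they are not measurements of anything):

      ```python
      import math

      # Hypothetical cost models: T(N) = c + f*N for a linear scan with a tiny
      # setup cost, versus T(N) = c + f*log2(N) for a binary search with a
      # larger fixed cost. All constants are made up for illustration.
      def t_linear(n, c=1.0, f=1.0):
          return c + f * n

      def t_binary(n, c=20.0, f=2.0):
          return c + f * math.log2(n)

      # For very small N, the "worse" O(N) algorithm wins on its lower constants:
      print(t_linear(8) < t_binary(8))        # True (9.0 vs 26.0)
      # For large N, the asymptotics take over:
      print(t_linear(1024) > t_binary(1024))  # True (1025.0 vs 40.0)
      ```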



      • #4
        Originally posted by Vorpal View Post

        Speaking generally, a lot of code that we call O(N) actually has a running time closer to T(N) = c + f·N, where c is a constant covering fixed setup costs and f is a constant scaling factor. However, big-O notation, by design/definition, keeps only the fastest-growing term.

        (Side note: this is why an algorithm with worse scaling can actually be faster for small sizes, because it has a lower fixed cost or a smaller constant scaling factor. Then there are effects like cache behaviour and branch prediction that might, for example, make a naive linear search faster than a binary search for very small N.)
        In this case I would be surprised if you gained 8/192 of 30 s on an 8-thread CPU even if the scaling were linear.
        If constant factors dominate on small-thread-count machines, that means the time spent enabling boost is even longer!

        I haven't dug into this enough to know how it works, but hopefully this patch makes the whole business asynchronous and takes it out of the critical path.
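
        A minimal sketch of that idea, taking a slow per-CPU step off the critical path by deferring it to a background thread. This is not the actual kernel patch; the per-CPU cost and the threading mechanism are stand-ins for illustration:

        ```python
        import threading
        import time

        def enable_boost(cpu):
            """Stand-in for a slow per-CPU operation (e.g. an MSR write)."""
            time.sleep(0.005)

        def bring_up(n_cpus, deferred):
            """Return how long the 'boot path' waits before it can continue."""
            work = lambda: [enable_boost(c) for c in range(n_cpus)]
            start = time.monotonic()
            if deferred:
                # Hand the work to a background thread; boot continues at once.
                threading.Thread(target=work, daemon=True).start()
            else:
                work()  # serialized into the boot path
            return time.monotonic() - start

        synchronous = bring_up(64, deferred=False)   # ~0.32 s on the critical path
        asynchronous = bring_up(64, deferred=True)   # returns almost immediately
        print(asynchronous < synchronous)
        ```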



        • #5
          Originally posted by yump View Post

          If constant factors dominate on small-thread-count machines, that means the time spent enabling boost is even longer!

          I haven't dug into this enough to know how it works, but hopefully this patch makes the whole business asynchronous and takes it out of the critical path.
          I ended up taking a look at this. It looks like it only saves time on systems lacking P-states? Maybe it is a server thing where they run full tilt all the time. It doesn't look like they improved the sequential behaviour at all, which would have been nice to see.
