Linux 6.6 Delivers Some Impressive Gains For AMD EPYC 9754 "Bergamo" Server Performance


  • #11
    Originally posted by Kjell View Post
    CachyOS has been using EEVDF with BORE tweaks for quite some time now
    For whatever reason their EEVDF kernel gives me random resets under no load. Like, after a fresh boot and login I'll click the KDE Start menu and end up seeing my UEFI initialize.

    EDIT: I think I was undervolting my CPU a hair too much and the change from CFS to EEVDF was enough to trigger a crash. I changed the PBO Curve Optimizer from -40 to -35 to -30 and it stopped hard resetting out of nowhere. Odd that it didn't happen with CFS at -35 or -40 but does with EEVDF.
    Last edited by skeevy420; 14 September 2023, 05:43 PM.

    Comment


    • #12
      Now let's see how long it takes for enterprise distros to pick this up ...

      Comment


      • #13
        Originally posted by Kjell View Post
        CachyOS has been using EEVDF with BORE tweaks for quite some time now
        Indeed. I suspect BORE would be optimal for gaming oriented users.

        Comment


        • #14
          yep, I benchmarked that earlier for the eevdf scheduler: https://youtube.com/live/4IFfRQs_zeM

          Comment


          • #15
            Any chance we could get just the Postgres benchmarks done with the SRSO mitigation disabled? That was one of the workloads significantly affected and I remember there were some fix up patches in 6.6, so wondering how much of those huge improvements are attributed to that.
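For anyone wanting to confirm what state SRSO is in before and after such a run, here is a rough sketch. It assumes a kernel new enough (6.5 or later) to expose the `spec_rstack_overflow` entry under the sysfs vulnerabilities directory; the `classify` helper and its category names are my own, and the exact status strings differ between kernel versions, so only the leading word is matched.

```python
from pathlib import Path

# sysfs entry reporting the SRSO state (assumed present on kernels >= 6.5).
SRSO_PATH = Path("/sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow")


def classify(status):
    """Map a raw status line to a coarse, made-up category.

    Only the leading word is inspected because the rest of the string
    varies between kernel versions.
    """
    text = status.strip().lower()
    if text.startswith("not affected"):
        return "not affected"
    if text.startswith("mitigation"):
        return "mitigated"
    if text.startswith("vulnerable"):
        return "vulnerable"
    return "unknown"


def srso_status(path=SRSO_PATH):
    """Return the raw status line, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None
```

Calling `srso_status()` on the test box and passing the result through `classify` should report "vulnerable" once the machine has been booted with `spec_rstack_overflow=off`, which makes it easy to verify that a benchmark run really had the mitigation disabled.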

            Comment


            • #16
              Originally posted by avis View Post
              Would be nice to compare CFS with EEVDF in terms of the number of lines of code.

              Could be that EEVDF allows these massive gains because it's literally 10 times less code.

              That's not all of course, various loops and logic are equally important. EEVDF could be just more streamlined and logical.
              CFS was not a horribly complex scheduler. In fact, that was one of the reasons it was chosen to replace the previous scheduler, which was considered too complex.

              Comment


              • #17
                Originally posted by Errinwright View Post

                Indeed. I suspect BORE would be optimal for gaming oriented users.
                BORE is not focused solely on gaming. Its main aim is to improve responsiveness under heavy load and in other desktop scenarios.
                But yes, we use it as the default in our kernel, since we see barely any regressions with it, and if we find one, the developer, mu, reacts really fast and brings a fix or improvement.
                He takes feedback really seriously, and we have been working together with him for a long time.

                Small addition: "we" means CachyOS.

                Comment


                • #18
                  I find it questionable whether these improvements come entirely from EEVDF itself.
                  The benchmarks we ran with EEVDF did not show such an improvement in our benchmark suite, and according to LKML it also showed some small regressions in throughput.
                  I think other commits are contributing as well; SRSO also got cleaned up, which lowered its overhead.

                  Comment


                  • #19
                    Originally posted by ptr1337 View Post
                    I find it questionable whether these improvements come entirely from EEVDF itself.
                    The benchmarks we ran with EEVDF did not show such an improvement in our benchmark suite, and according to LKML it also showed some small regressions in throughput.
                    I think other commits are contributing as well; SRSO also got cleaned up, which lowered its overhead.
                    No, these definitely do not come from EEVDF alone. I suspect the major gains are coming from the changes to the work queues. The idea of the work queue changes is to improve the locality for L3 caches.

                    For example, the AMD Ryzen 7 3800X (Zen 2) has 8 cores (16 threads) with 32MB of L3 cache, but that is actually two 16MB L3 caches, and each half of the cores uses only one of them. The Ryzen 7 5800X (Zen 3) also has 8 cores (16 threads) with 32MB of L3 cache, but here all cores share a single 32MB L3 cache. This is one of the reasons why Zen 3 is so much faster than Zen 2.
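To see how a given machine's L3 is split, something along these lines should work. It is only a sketch: it assumes the usual x86 sysfs layout where `cache/index3` is the L3 (strictly, one should check the `level` file next to it), and the function names are my own.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0-3,8-11' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_by_l3(shared_lists):
    """Return the distinct sets of logical CPUs that share one L3.

    shared_lists maps a CPU number to the raw shared_cpu_list string
    of that CPU's L3 cache.
    """
    groups = {tuple(parse_cpu_list(text)) for text in shared_lists.values()}
    return sorted(list(group) for group in groups)


def read_l3_lists(root="/sys/devices/system/cpu"):
    """Read every CPU's L3 shared_cpu_list from sysfs (Linux only).

    Assumes index3 is the L3 cache, which is typical but not guaranteed.
    """
    result = {}
    for path in Path(root).glob("cpu[0-9]*/cache/index3/shared_cpu_list"):
        cpu = int(path.parts[-4][3:])  # directory name 'cpu12' -> 12
        result[cpu] = path.read_text()
    return result
```

Running `group_by_l3(read_l3_lists())` on a chip with a split L3 should give two or more groups, while a chip with a unified L3 should give a single group containing every logical CPU.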

                    The changes to the work queues now take these differences in L3 cache designs into account by increasing the cache locality. So now threads stay more often on the core they were last on. However, this also has consequences for the L2 and L1 caches obviously as these will also see fewer cache misses. This may explain why in particular the nginx benchmark sees such a huge gain in performance. A run with perf should easily show this.

                    I am quite thrilled about what these overall gains could mean for gaming. While some games run as fast or faster on Linux, there are still many games that run faster under Windows. The gains in kernel 6.6 could be a major win for gaming on Linux, not only because they could bring many of the close wins for Windows on par with Linux (or better), but also because an improved scheduler will be a blessing for the responsiveness of games.
                    Last edited by sdack; 15 September 2023, 12:24 PM.

                    Comment


                    • #20
                      Originally posted by skeevy420 View Post
                      EDIT: I think I was undervolting my CPU a hair too much and the change from CFS to EEVDF was enough to trigger a crash. I changed the PBO Curve Optimizer from -40 to -35 to -30 and it stopped hard resetting out of nowhere. Odd that it didn't happen with CFS at -35 or -40 but does with EEVDF.
                      Unless you have the best golden sample out there, -30 is probably not stable. I have had errors with -20 in Prime95 and am now back at -15, which I haven't tested extensively.

                      All-core was stable for me at -20, but when I tested single-threaded, with and without HT, I got errors. I recommend testing a few different threading scenarios.

                      Comment
