Linus Torvalds Hits Nasty Performance Regression With Early Linux 6.8 Code


  • Linus Torvalds Hits Nasty Performance Regression With Early Linux 6.8 Code

    Phoronix: Linus Torvalds Hits Nasty Performance Regression With Early Linux 6.8 Code

    It's not often that Linus Torvalds himself raises the alarm over performance regressions in the Linux kernel, but that happened this evening during the ongoing Linux 6.8 merge window. Torvalds' AMD Ryzen Threadripper system was suddenly suffering from much longer build times as a result of new code for this kernel...

  • #2
    Someone needs to explain this to me.

    There were some code changes within the kernel and during compilation it was realized that it was taking longer to build than normal.

    They consider this a "regression", i.e. that the final compiled kernel would have a performance regression.

    The implication is that the amount of time it takes to compile the kernel is somehow proportional to the performance the compiled kernel is capable of.

    Is this a correct way of looking at this announcement?


    • #3
      Originally posted by sophisticles
      Someone needs to explain this to me.
      Linus eats his own dog food (as he makes it). He merged various patches, compiled the kernel, and then booted into the new kernel to continue his work. His future compiles are slower. He then bisects the patches he merged to identify a likely suspect.

      While compiling a kernel does not exercise all possible kernel code paths (and the result could always be an anomaly), and additional performance testing is necessary (there is a set of systems doing just that), going from 22 seconds to 44 seconds for an empty build was very noticeable, and Linus was not pleased.
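
      For anyone who wants to check this kind of thing on their own box, here is a minimal sketch of timing the same command before and after a kernel change. The function names `time_command` and `slowdown` and the placeholder command are made up for illustration; on a real kernel tree you would pass your own empty `make` invocation instead.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def slowdown(baseline_seconds, current_seconds):
    """Ratio of the current duration to the baseline (2.0 means twice as slow)."""
    return current_seconds / baseline_seconds


if __name__ == "__main__":
    # Placeholder command (an assumption); pass your own build command as arguments.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    samples = time_command(cmd)
    print(f"median: {statistics.median(samples):.2f}s over {len(samples)} runs")
    # The two figures quoted in this thread: 22 s before, 44 s after.
    print(f"slowdown: {slowdown(22, 44):.1f}x")
```

      Taking the median of a few runs helps rule out the one-off anomaly case mentioned above.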


      • #4
        I bet you good money that this code only affects Ryzens and works marvelously on Intels….

        Backtracking the bad code leads us to……😈


        • #5
          Someone needs to explain this to me.
          Glad you asked the question! Thx.... I was also having a hard time figuring out how one change could affect the compile time so severely..... Wasn't computing in my brain. Now we know.


          • #6
            Originally posted by rclark
            Glad you asked the question! Thx.... I was also having a hard time figuring out how one change could affect the compile time so severely..... Wasn't computing in my brain. Now we know.
            The problem was tracked down to "sched/cpufreq: Rework schedutil governor performance estimation", so the issue is presumably that the kernel isn't realizing it needs to clock the CPU up to high frequencies when it gets busy, at least on his Threadripper with a compile workload. I imagine it varies.
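
            If you want to see whether your own CPUs are clocking up under load, a rough sketch like the one below reads the standard cpufreq sysfs files (`scaling_governor`, `scaling_cur_freq`, `scaling_max_freq`) for each CPU. The helper name `read_cpufreq` and its `root` parameter are just for illustration, and this only shows a snapshot; it is not how the regression was actually diagnosed.

```python
from pathlib import Path

# Standard location of per-CPU cpufreq data on Linux.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def read_cpufreq(root=CPUFREQ_ROOT):
    """Return {cpu_name: (governor, current_khz, max_khz)} for CPUs exposing cpufreq."""
    result = {}
    for policy_dir in sorted(Path(root).glob("cpu[0-9]*/cpufreq")):
        try:
            governor = (policy_dir / "scaling_governor").read_text().strip()
            cur_khz = int((policy_dir / "scaling_cur_freq").read_text())
            max_khz = int((policy_dir / "scaling_max_freq").read_text())
        except (OSError, ValueError):
            # Skip CPUs whose files are missing or unreadable.
            continue
        result[policy_dir.parent.name] = (governor, cur_khz, max_khz)
    return result


if __name__ == "__main__":
    for cpu, (governor, cur_khz, max_khz) in read_cpufreq().items():
        print(f"{cpu}: {governor} {cur_khz / 1000:.0f} MHz of {max_khz / 1000:.0f} MHz max")
```

            Run it once while idle and once during a compile; if the governor is schedutil and the current frequency stays well below the maximum under load, that would be consistent with the behaviour described here.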


            • #7
              It's a bit surprising that nearly four years later he's still relying on the Threadripper 3970X workhorse considering the much faster performance now available especially with the Ryzen Threadripper 7000 series class systems.
              I don't think it is surprising at all. The machine is still very fast and probably fast enough for what it is supposed to do. 22 seconds are not that bad, are they?


              • #8
                Originally posted by oleid
                I don't think it is surprising at all. The machine is still very fast and probably fast enough for what it is supposed to do. 22 seconds are not that bad, are they?
                But then he could do it in 21 seconds. Isn't that worth spending $5000?

                Someone is going to need this: /sarc


                • #9
                  Originally posted by NeoMorpheus
                  I bet you good money that this code only affects Ryzens and works marvelously on Intels….

                  Backtracking the bad code leads us to……😈
                  ...led Linus to 4 commits from Linaro, which happens to be an Arm hardware group. (Read the whole article :P ) It has nothing to do with AMD, so it shouldn't have affected any hardware but Arm to begin with. If it had screwed with performance on POWER9 CPUs, it'd still have had to be reverted, cuz someone forgot to check the CPUIDs.


                  • #10
                    Originally posted by oleid
                    22 seconds are not that bad, are they?
                    When you do this a thousand times over (which is what workstations and servers normally are used for!), it quickly adds up.
