Benchmarking The Linux Kernel With An "-O3" Optimized Build

  • #11
    Originally posted by CochainComplex View Post
    Mh yes. But there was so far no sudden apocalyptic showstopper as speculated either.
    His argument was not that it breaks all the time. Roughly, his argument was that it may trigger or create bugs sometimes/randomly, with unclear benefits. Again, with the side note that nobody stops you from building the kernel with -O3. It is just not an official option.


    • #12
      Originally posted by V1tol View Post
      I think having -march=native gives more performance gains than the -O2 => -O3 transition.
      It surely does, but it also reduces CPU compatibility significantly (whereas -O3 doesn't). Ideally, you should use both, but providing kernel binaries for all CPU combinations would require too much time/effort for most distributions. Providing binaries for common CPU feature sets (<= SSE2, <= SSE4.2, <= AVX2) could be a decent middle ground, but that's still 3× more binaries than you'd usually distribute.
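To make the "common CPU feature sets" idea concrete, here is a minimal sketch of how an installer could pick one of the three builds. The tier names, the `pick_tier` and `read_cpu_flags` helpers, and the "generic" fallback are assumptions for illustration, not something any distribution is known to ship. The flag spellings (`sse2`, `sse4_2`, `avx2`) are the ones x86 Linux reports in /proc/cpuinfo.

```python
# Hypothetical build tiers, most demanding first. Names are illustrative only.
TIERS = [
    ("avx2", {"sse2", "sse4_2", "avx2"}),
    ("sse4.2", {"sse2", "sse4_2"}),
    ("sse2", {"sse2"}),
]


def pick_tier(cpu_flags):
    """Return the most capable tier whose required flags are all present."""
    flags = set(cpu_flags)
    for name, required in TIERS:
        if required <= flags:
            return name
    return "generic"  # assumed fallback build with no extra requirements


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the CPU flag list from /proc/cpuinfo (x86 Linux layout)."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return line.split(":", 1)[1].split()
    except OSError:
        pass
    return []  # unknown platform or unreadable file


if __name__ == "__main__":
    print(pick_tier(read_cpu_flags()))
```

Each tier lists every flag it needs, so a CPU that reports `avx2` but not `sse4_2` falls back to the `sse2` build instead of being given a binary it might not run.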


      • #13
        Originally posted by milkylainen View Post
        I think the results should be interpreted otherwise.
        It's not strange that you don't see any benefits from something that spends < 1% cpu time in kernelspace.
        If you're measuring an entire system then yeah, perhaps not that much difference.

        But as the synthetic tests measuring syscalls or actual kernel operations (like context switching)...
        Then wow did that O3 flag mean improvement!

        In summary: It's unfair to say that it doesn't help the kernel when measuring an entire system.
        I speculate the result of context switching might be wrong.

        It gives such a big improvement that I suspect that it involves some bugs.
        For example, maybe it somehow removes the flushing/software mitigations for Spectre/Meltdown?
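For anyone who wants to sanity-check the context-switching number on their own -O2 and -O3 builds, a rough pipe ping-pong sketch follows. This is not the benchmark used in the article; the `pingpong` helper and its `rounds` parameter are made up here. Python's interpreter overhead is large, and on a multi-core machine the two processes may run on different CPUs, so treat the result only as a relative figure between two kernels on the same box.

```python
import os
import time


def pingpong(rounds=10000):
    """Return average seconds per round trip between two processes over pipes."""
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: echo every byte back, then exit without running parent cleanup.
        try:
            os.close(p2c_w)
            os.close(c2p_r)
            for _ in range(rounds):
                os.write(c2p_w, os.read(p2c_r, 1))
        finally:
            os._exit(0)
    # Parent: close the ends it does not use, then time the round trips.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(p2c_w, b"x")  # wake the child
        os.read(c2p_r, 1)      # block until the child answers
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    os.close(p2c_w)
    os.close(c2p_r)
    return elapsed / rounds


if __name__ == "__main__":
    print(f"{pingpong() * 1e6:.2f} us per round trip")
```

Every round trip forces each side to block in `read` until the other side writes, which is why this kind of test leans so heavily on the kernel's wakeup and switch paths.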


        • #14
          I'm honestly surprised to see that there were any benefits at all. Some of them even looked compelling for large web service providers.


          • #15
            Originally posted by NobodyXu View Post

            I speculate the result of context switching might be wrong.

            It gives such a big improvement that I suspect that it involves some bugs.
            For example, maybe it somehow removes the flushing/software mitigations for Spectre/Meltdown?
            You might be right. It's worth exploring: if -O3 makes this benchmark go wild, it could also be that the code being stressed is written in a way that leads GCC to mis-optimize it. If an issue is discovered and fixed, maybe we get the best of both worlds: a performance improvement and correct behavior.


            • #16
              Originally posted by -MacNuke- View Post
              with unclear benefits
              I think the benefits are pretty clear now and wider testing will help to fix these bugs.


              • #17
                Originally posted by V1tol View Post
                I think having -march=native gives more performance gains than the -O2 => -O3 transition.
                Not really, because the kernel does not use any SIMD, so -march=native does next to nothing: mostly insn scheduling, ... ;-)


                • #18
                  Originally posted by milkylainen View Post
                  I think the results should be interpreted otherwise.
                  It's not strange that you don't see any benefits from something that spends < 1% cpu time in kernelspace.
                  If you're measuring an entire system then yeah, perhaps not that much difference.

                  But as the synthetic tests measuring syscalls or actual kernel operations (like context switching)...
                  Then wow did that O3 flag mean improvement!

                  In summary: It's unfair to say that it doesn't help the kernel when measuring an entire system.
                  This is basically in line with what I predicted back in that other thread: applications that spend lots of time in the kernel, like PostgreSQL, get quite a big benefit compared with designs that spend most of their time in user space, like RocksDB and Redis.
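The reasoning here is just Amdahl's law applied to the share of time spent in the kernel. A small sketch, where `overall_speedup` and the example numbers are hypothetical and not taken from the benchmark results:

```python
def overall_speedup(kernel_fraction, kernel_speedup):
    """Amdahl's law: whole-workload speedup when only the kernel part gets faster.

    kernel_fraction: share of run time spent in the kernel, between 0.0 and 1.0.
    kernel_speedup:  how much faster the kernel part becomes (1.10 means 10%).
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    return 1.0 / ((1.0 - kernel_fraction) + kernel_fraction / kernel_speedup)


if __name__ == "__main__":
    # Made-up inputs: a mostly user-space workload vs. a kernel-heavy one.
    print(overall_speedup(0.01, 1.10))  # roughly 1.001
    print(overall_speedup(0.50, 1.10))  # roughly 1.048
```

With an assumed 10% faster kernel, a workload that spends 1% of its time in kernel space gains about 0.1% overall, which disappears in measurement noise, while one that spends half its time there gains almost 5%.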


                  • #19
                    When it came to the -O3 kernel build for other workloads like gaming/graphics, web browsing performance, and various creator workloads there was no measurable benefit from the -O3 kernel.
                    Just what I expected. A well-written kernel must have next to zero impact on performance other than through the CPU-intensive features the kernel itself provides, i.e. encryption, connections, context switching, etc., which is not what the absolute majority of users ever deal with.

                    Case closed.


                    • #20
                      Originally posted by birdie View Post

                      Just what I expected. A well-written kernel must have next to zero impact on performance other than through the CPU-intensive features the kernel itself provides, i.e. encryption, connections, context switching, etc., which is not what the absolute majority of users ever deal with.

                      Case closed.
                      Disagree. A lot of users use zram/zswap and/or LUKS, maybe something like WireGuard, so any gains in that area are a big win. Sadly, none of those were tested here, but my guess is -O3 might help for those workloads. In that case it would make sense to optimize those parts more aggressively than the rest. The only reason it can't be easily done is that the kernel build system doesn't support it. It's all or nothing.
