Upstream Linux Developers Against "-O3" Optimizing The Kernel

  • Upstream Linux Developers Against "-O3" Optimizing The Kernel

    Phoronix: Upstream Linux Developers Against "-O3" Optimizing The Kernel

    The upstream Linux kernel developers have come out against a proposal to begin using the "-O3" optimization level when compiling the open-source code-base with the GCC 10 compiler or newer...

  • #2
    Some testing would be great, so that we have hard facts at hand. I compile my kernel with -O3 and even more aggressive flags such as -march=native and Graphite, and it serves me well.


    • #3
      LTO and PGO might have a more profound effect on speed, but I guess getting good profile data for the kernel is hard.


      • #4
        There's a reason -O3 isn't even offered as an option. Maybe things have changed, and maybe they've improved. But I'd like to see actual numbers for something like this.
        To me, that sounds like Linus needs Michael to get the numbers.


        • #5
          With all the work on the compiler front over the years, and all the performance work on the kernel, I would have guessed that both projects test the impact of different compiler options from time to time. From reading that thread, this doesn't seem to be the case at all, and it is left to the users to find out.


          • #6
            Testing. Testing. Testing.

            While not a huge priority, I think one should never abandon the possibility of increasing performance overall.


            • #7
              I cannot find the source for this now, but if I remember correctly, even the GCC developers themselves have always treated -O3 as experimental. So, please, no.

              To be honest, a kernel that works well should not even register in real-life benchmarks for average people, unless you have a single-core CPU and are juggling 20 tasks simultaneously, or are running tests which, e.g., create/free millions of small chunks of memory or create/destroy many short-lived threads. All these scenarios are very unnatural, and though they might occur in real life, they cannot be widespread.

              For HPC and servers the Linux kernel may indeed do a lot of internal work (think of RAID checksumming, networking, firewalling, juggling thousands of web server threads, etc.), but these workloads are very specific and should be benchmarked separately. You may go as far as to compile only certain parts of the kernel with -O3 while leaving everything else at -O2.
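
              As a rough illustration of "-O3 only for certain parts", here is a minimal user-space sketch using GCC's `optimize` function attribute. The function `xor_parity` is a made-up stand-in for a hot path and is not kernel code. GCC's own documentation cautions that this attribute is meant for debugging rather than production, and in the kernel the more likely route would be per-file flags in the build system, so treat this only as a way to experiment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical hot path: XOR parity over a buffer, loosely in the
 * spirit of RAID checksumming. The attribute asks GCC to optimize
 * this one function at -O3 even when the rest of the translation
 * unit is built with -O2. Other compilers may ignore it.
 */
__attribute__((optimize("O3")))
uint8_t xor_parity(const uint8_t *buf, size_t len)
{
    uint8_t parity = 0;

    /* An empty buffer is allowed; a NULL pointer with len > 0 is not. */
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++)
        parity ^= buf[i];

    return parity;
}
```

              Building the file once with plain -O2 and once with the attribute in place, then comparing the generated code and timings on the target hardware, would give the kind of numbers this thread keeps asking for.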

              And the "testing" which has been mindlessly mentioned above will not solve the issue of extreme code bloat, which is the other name for -O3 and which could spell a death knell for CPUs with relatively small L2/L3 caches. You could test -O3 on a Ryzen 9 3950X (64MB of L3 cache, enough to run the entire Windows 95 from it) and decide it's worth it, and then someone could have a Celeron CPU with just 2MB of L3 cache and their experience would be wildly different.


              • #8
                Who uses the upstream kernel? Don't distros compile it themselves?


                • #9
                  Originally posted by bug77
                  Who uses the upstream kernel? Don't distros compile it themselves?
                  I have been compiling the kernel since ... forever. The distro one (I use Fedora) is bloated as hell and is compiled with a ton of safety options which slow it down. I only use Fedora's kernel on my laptop, where secure boot is enabled: I am too lazy to sign each new release and all the modules. And then BIOS updates remove my custom certificate, and that's a new round of hassle.


                  • #10
                    The only flaw in logic I see in your argument is that not testing and trying different optimizations is like saying Clear Linux shouldn't be experimenting with optimizations that may or may not be beneficial and then working their way from there. With the wide variety of hardware that exists these days, holding back for compatibility's sake isn't necessarily the best solution. It's like me getting too upset that my old BIOS system can't use systemd-boot to make my Linux life so much easier, because systemd-boot decided to support newer standards and left users of older standards to the tools that worked for them.

                    At the distribution level it could be "as simple" as offering an additional opt-in kernel-optimized package. For other scenarios and situations, we basically have to cross our fingers and hope that the maintainer(s) are qualified enough to make the right calls when configuring everything, whether it be the kernel, the audio system, or an Android ROM (official or unofficial... doesn't really matter).

                    But, to solve the issue of extreme code bloat you mention, I've always thought some sort of "O3 without the loop optimizations" option would be a decent compromise, based on all the anecdotal and random stuff I've read over the years. Make something like that the new O3 and bump the "old" O3 up to O4. As to what to use or go with... not sure. I've been going with O2 and -march=native since it works pretty well. I'll leave that up to people much smarter in the ways of compilers since, for all I know, that could be the dumbest idea ever suggested.