Benchmarking The Linux 5.19 Kernel Built With "-O3 -march=native"


  • #51
    Originally posted by mlau View Post

    To me it is .. I'd have expected that the usage of all the additional instruction set extensions which came out since the original K8 was designed (bmi1/2, movbe in particular) had a more positive impact. Or the gcc tuning model for alder lake is simply garbage.
    As someone else already commented, maybe rerun this test on icelake or one of the skylake derivatives and maybe a zen3, so the picture becomes a bit clearer.
    As Anux already mentioned, Michael ran a test in October 2021 pairing GCC 11/12 with AMD Zen 3 at -O2 / -O3 / -O3+native / -O3+native+flto.

    The results in a nutshell: O3+native is better than O3, which is better than O2.
    On Zen 3, recompiling with O3+native gives a considerable improvement over O2, as does using GCC 12 instead of GCC 11.

    See Michael's article here for details.

    As the article is already nearly a year old, I wonder if the results might have improved even more in the meantime.

    This is also why, when Michael last tested custom kernels, I asked whether he used xanmod-generic or xanmod-x64v2: Zen really benefits from a kernel built for newer instruction sets, and the generic build leaves potential untapped. Nobody would pick the generic build when using a custom kernel to begin with, would they?



    • #52
      Originally posted by reba View Post
      This is also why, when Michael last tested custom kernels, I asked whether he used xanmod-generic or xanmod-x64v2: Zen really benefits from a kernel built for newer instruction sets, and the generic build leaves potential untapped. Nobody would pick the generic build when using a custom kernel to begin with, would they?
      Cool, I didn't know there was a v2 version of custom kernels. I only ever get a choice between v1 and v3, but my processor is v2.
      I would like more packages for Arch v2.
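
For anyone unsure which x86-64 level their CPU belongs to, here is a rough Python sketch that derives it from the CPU feature flags. The names `LEVEL_FLAGS`, `x86_64_level` and `read_cpu_flags` are made up for illustration. The flag spellings are the ones Linux normally prints in /proc/cpuinfo (`pni` for SSE3, `abm` for LZCNT), and the per-level lists are a best-effort transcription of the x86-64 psABI levels, so double-check them before relying on the result.

```python
# Flags required at each x86-64 microarchitecture level, spelled the way
# Linux lists them in /proc/cpuinfo. Assumption: this table is a
# best-effort transcription, not an official reference.
LEVEL_FLAGS = [
    (1, {"lm", "cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse2"}),
    (2, {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    (3, {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    (4, {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def x86_64_level(flags):
    """Return the highest level (0-4) whose flags, and those of all
    lower levels, are present in `flags`."""
    available = set(flags)
    level = 0
    for number, required in LEVEL_FLAGS:
        if not required <= available:
            break  # levels are cumulative, so stop at the first gap
        level = number
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the flag set of the first CPU listed in /proc/cpuinfo (Linux only)."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()
```

On a Linux box, `x86_64_level(read_cpu_flags())` should give the number to match against the v1/v2/v3 package or kernel variants. The check is cumulative on purpose: a CPU with AVX2 but missing a v2 flag still counts as v1.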



      • #53
        Originally posted by reba View Post

        The results in a nutshell: O3+native is better than O3, which is better than O2.
        On Zen 3, recompiling with O3+native gives a considerable improvement over O2, as does using GCC 12 instead of GCC 11.
        I tried it myself: the .text (code) section built with "-O2 -march=znver3" is 25% smaller than with plain "-O2" (24MB instead of 32MB), but the overall kernel binary is 6MB larger (42MB compared to 36MB) due to a 7x increase in .data section size. I guess the reduced icache footprint helps Zen performance quite a bit, and the large-ish data caches compensate for the larger static data portion. Haven't benchmarked it, though.
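
To make the arithmetic behind those numbers explicit, here is a small Python sketch using only the rounded MB figures quoted above. The helper name `pct_change` and the variable names are illustrative. The "rest" value is simply total minus .text, so it lumps .data together with every other section and is not a measurement of .data itself.

```python
# Rounded sizes in MB, as quoted in the post above.
text_o2, text_znver3 = 32, 24      # .text with -O2 vs. -O2 -march=znver3
total_o2, total_znver3 = 36, 42    # whole kernel binary


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


text_change = pct_change(text_o2, text_znver3)      # code section shrinks
total_change = pct_change(total_o2, total_znver3)   # whole binary grows

# Everything that is not .text (assumption: total minus .text,
# which includes .data but also other sections).
rest_o2 = total_o2 - text_o2
rest_znver3 = total_znver3 - text_znver3

print(f".text: {text_change:+.1f}%  total: {total_change:+.1f}%")
print(f"non-.text: {rest_o2} MB -> {rest_znver3} MB")
```

So the code shrinks by a quarter, while everything outside .text grows from roughly 4MB to 18MB and more than cancels the saving.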

        Still, I'd expect Alder Lake to reap similar benefits, at least on the big cores.



        • #54
          Originally posted by Michael View Post

          Yep it's all open source. People are lazy?
          Panem et circenses!



          • #55
            Originally posted by cj.wijtmans View Post
            Personally I use -Os -march=native on servers.
            -Os is really, really bad. You may want to reconsider your decision: Michael has tested it on several occasions, and the performance loss is at times jarring.

            -Os is primarily used for embedded systems; it shouldn't be used on CPUs with enough cache.



            • #56
              Originally posted by AdrianBc View Post


              The fact that compiling with more optimizations has little effect on the kernel, which contains mostly unstructured code, is not a surprise at all.


              On the other hand there are other applications, which spend most of their time in loops iterating over regular data structures, where the compiler optimizations and the selection of the appropriate instruction set can make a very large difference in speed and where the compilation from source in Gentoo provides a noticeable improvement.

              I normally use Gentoo on my laptops and desktops, but when I happen to use Ubuntu or Fedora on the same hardware, e.g. before installing Gentoo, they definitely feel more sluggish.


              Even if I always compile a custom kernel configuration, for the kernel I have never used other compilation flags except the default, because I agree that an improvement over that is unlikely.
              Any data to back it up? Benchmarks, whatever? Are you sure the setups are exactly the same? I mean, you may have fine-tuned your Gentoo installation (removed/stopped various services, disabled swap) while running Fedora/Ubuntu with their defaults.



              • #57
                Originally posted by Anux View Post

                Since your Google seems to be broken, here you go: https://www.phoronix.com/scan.php?pa...ber-Zen-3-Perf
                Thanks. The same regression doesn't seem to happen on Ryzens.



                • #58
                  Originally posted by birdie View Post

                  Any data to back it up? Benchmarks, whatever? Are you sure the setups are exactly the same? I mean, you may have fine-tuned your Gentoo installation (removed/stopped various services, disabled swap) while running Fedora/Ubuntu with their defaults.

                  I have no doubt that all the standard distributions that I have tried are slower, but I have never bothered to investigate the cause, to see how much, if any, comes from compilation options. On the programs that I write myself, the compilation options have a very large influence on performance, but those are exactly the kind of programs where this is expected, which spend much time with array operations and loops. In the kernel, it is likely that more time is spent waiting for memory accesses to be completed than actually executing instructions, so there are few places where the quality of the code generated by the compiler can make much difference.


                  Of course, there are several very important contributors to the relative slowness of the standard distributions versus the Gentoo that I am using, which have nothing to do with the optimized compilation in Gentoo but are peculiar to the system configuration that I happen to use. For example, after I install Gentoo, the system boots much faster because I use a custom kernel configuration instead of the generic kernel, and it is likely that many GUI applications seem more responsive because I use XFCE instead of the Gnome or KDE used in most standard distributions.

                  In my opinion the best desktop environment ever was KDE 3.5, but the team that took over the development of KDE destroyed it, so after it became too painful to maintain the old KDE 3.5, I switched to XFCE, which has fewer features but is flexible enough to be customized the way I like, and it appears to have lower overhead than the more complex Gnome/KDE.


                  In any case, the main reason why I compile almost all programs that I use from source is not the hope that this might make them faster, but that I want to be able to modify them whenever I want them to behave differently, and to be able to discover the cause of any mysterious bug or unexpected behavior. On Windows, I have experienced many cases where even a team of IT support people working for a couple of weeks was unable to solve some mysterious bug, but on Linux, thanks to the availability of the source, I have never seen a problem that could not be solved in a reasonable time.
                  Last edited by AdrianBc; 15 July 2022, 12:50 AM.



                  • #59
                    Originally posted by birdie View Post
                    Poor Gentoo users who have always insisted on building everything with -march=native. LOL.

                    Oh, and the -O3 kernel is so much faster than -O2, a whole percent, and that's on a CPU with lavish L2/L3 caches. Wow. So many comments earlier, so much pain.
                    When you consider how much of the CPU time is spent in the kernel vs userspace, a whole percent is quite a lot.
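
To put a rough number on that, here is a Python sketch that solves Amdahl's law for the kernel-side speedup implied by a given overall gain. The function name `kernel_speedup` is made up, the 10% kernel share in the usage note is an invented figure and not taken from the benchmarks, and the model assumes userspace time is completely unaffected by the kernel build.

```python
def kernel_speedup(kernel_fraction, overall_speedup):
    """Speedup the kernel code itself must have to yield `overall_speedup`,
    if only `kernel_fraction` of the runtime is spent in the kernel.

    Amdahl's law: new_total = (1 - f) + f / s, solved here for s.
    """
    if not 0 < kernel_fraction <= 1:
        raise ValueError("kernel_fraction must be in (0, 1]")
    new_total = 1.0 / overall_speedup               # old runtime normalised to 1
    new_kernel = new_total - (1.0 - kernel_fraction)
    if new_kernel <= 0:
        raise ValueError("overall speedup is not reachable with this kernel share")
    return kernel_fraction / new_kernel
```

With a hypothetical 10% of the time spent in the kernel, `kernel_speedup(0.10, 1.01)` comes out at about 1.11, i.e. the kernel code would have to run roughly 11% faster to show up as 1% overall.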



                    • #60
                      Originally posted by birdie View Post

                      -Os is really, really bad. You may want to reconsider your decision: Michael has tested it on several occasions, and the performance loss is at times jarring.

                      -Os is primarily used for embedded systems; it shouldn't be used on CPUs with enough cache.
                      What if I don't need performance? I never said what the server is used for.
