The Performance Impact Of GCC CPU Tuning On The Linux Kernel's Performance

  • The Performance Impact Of GCC CPU Tuning On The Linux Kernel's Performance

    Phoronix: The Performance Impact Of GCC CPU Tuning On The Linux Kernel's Performance

    Last week a patch was proposed for the mainline Linux kernel, one long carried by Gentoo's kernel, to provide CPU optimization options. It was quickly shot down by upstream maintainers, but there were many requests to benchmark said patch... Here are dozens of performance figures looking at the impact of these optimizations for AMD Zen (znver1), Skylake, and Skylake X (Skylake-AVX512) compared to a stock mainline kernel build on several different systems.

    http://www.phoronix.com/vr.php?view=27585

  • #2
    You're comparing optimisation levels on machines which are deeply OoO and have huge instruction windows and/or reorder buffers. The results are as expected, obviously. Now, I'd love to see the same comparison on an old Atom (Diamondville, Pineview or Cedarview, all Bonnell microarchitecture) system and on the Raspberry Pi 1/Zero.

    • #3
      Thank you for the numbers, although I had hoped for a more significant difference.

      • #4
        Thank you for the much needed benchmark on this. It's a bummer but knowledge is power in the end. Now I won't feel like I'm missing anything if I don't apply it.

        • #5
          Where was the -march=native setting?

          • #6
            I wonder whether older CPUs with more mature GCC optimizations yield better performance gains. The frame time test is probably the most interesting one, as it would suggest slightly lower latency on optimized kernels, but the other tests of that kind weren't that conclusive.

            • #7
              Interesting to see the test results, and slightly odd, because I've actually run several of these tests on my AMD Ryzen 2700X build and got different results for Apache with a "-march=" kernel. Then again, I'm running Gentoo, and my entire system was built with "-march=znver1" and "-O2 -pipe" from a stage1 bootstrap. That makes me wonder whether the results would be any different when the entire toolchain and library setup is optimized. It would be interesting to see all of these tests run on such a setup.

              • #8
                So basically

                https://i.imgur.com/8rVQV1f.mp4

                • #9
                  Thanks for testing.

                  • #10
                    Originally posted by HyperDrive
                    You're comparing optimisation levels on machines which are deeply OoO and have huge instruction windows and/or reorder buffers. The results are as expected, obviously. Now, I'd love to see the same comparison on an old Atom (Diamondville, Pineview or Cedarview, all Bonnell microarchitecture) system and on the Raspberry Pi 1/Zero.
                    I tested this some years ago on AMD Kabini (so the Jaguar arch) and got the same kind of results Michael is getting: some things go up a bit and some also flop.

                    Not needed really
