Clang PGO Shot Down For Now From The Linux Kernel

  • Clang PGO Shot Down For Now From The Linux Kernel

    Phoronix: Clang PGO Shot Down For Now From The Linux Kernel

    While Clang PGO support was sent in for Linux 5.14 as part of Clang compiler handling updates for this next kernel version, the functionality was subsequently dropped out and a new pull request issued after criticism from Linus Torvalds and others...

  • #2
    We just cannot have nice things.

    • #3
      The kernel built with Clang is working great here on my systems with LTO, including now on my Raven machine. I've never tried PGO; it always seemed like too much of a hassle to create profiles.

      • #4
        Originally posted by FireBurn
        The kernel built with Clang is working great here on my systems with LTO, including now on my Raven machine. I've never tried PGO; it always seemed like too much of a hassle to create profiles.
        Same here. I don't understand why perf is favored over Clang PGO if it is obviously inferior (according to Michael's arguments). Is it possible that Linus is not fully aware of the whole scope? But I do understand that a single compiler-independent solution is preferable.

        • #5
          Originally posted by CochainComplex

          Same here. I don't understand why perf is favored over Clang PGO if it is obviously inferior (according to Michael's arguments). Is it possible that Linus is not fully aware of the whole scope? But I do understand that a single compiler-independent solution is preferable.
          Linus also argues that Clang's profiling instrumentation adds overhead and makes the system slower.
          As a result, what you measure is not the real performance of the system.

          He considers Linux's existing perf support mature and good enough on x86, and OK on ARM.

          • #6
            A later post explains his reasoning a bit better:

            I agree that perf profiling works best on Intel. The AMD perf side works ok in Zen 2 from what I've seen, but needs to be a full-system profile ("perf record -a") to use the better options, and ARM is..

            But with x86 ranging from "excellent" to "usable", and ARM hopefully being at least close to getting better proper profile data, I really think it's the way forward, with instrumentation being a band-aid at best.

            Linus
            So he wants to hold off for a little while to do it right. Reasonable people can disagree with that but it seems like a perfectly sensible position to take, especially when people are free to carry the patch themselves if they want to.

            • #7
              In Ukraine, when our corrupt government is slowing down reforms, we say «зрада» (”betrayal”).

              Michael, Unicode input via the hex keyboard on macOS is somewhat broken here. For example, when I try to input guillemets (‘«’ and ‘»’) via ⌥+00ab and ⌥+00bb, ⌥+0 triggers the forum hotkeys tutorial.

              • #8
                Originally posted by nemequ
                So he wants to hold off for a little while to do it right.
                Torvalds is not "holding it off for a little while". He did not give a timeline, he is not writing his own implementation, he is not doing anything except saying no. Google's implementation works, and Torvalds is only abusing his position to try to get Google to write completely new code. Nothing about this is "perfectly reasonable". It is about as narcissistic as Trump's politics. Linux is governed by an autocracy and young people find nothing wrong with it.

                • #9
                  As GCC and Clang differ in some ways in how they collect and use profiles, I cannot see a problem with using distinct PGO infrastructure for each compiler (unifying this between GCC and Clang would be a nice bonus, but is unlikely to happen anytime soon). Compiler-native PGO is also better supported, so going the perf route doesn't seem to be the obviously better option here in my eyes.

                  • #10
                    Originally posted by sdack
                    We just cannot have nice things.
                    Although, realistically, unless you are one of the hyperscalers or have a very specific runtime environment, you do not tend to have workloads for which PGO is going to make a substantial difference (exceptions noted). Google, Facebook, and others will simply carry these patches in their own kernels for now, as it can save them real money (saving a few thousand servers and their power is real money, and is even "green"ish).
