ORC Unwinder For Linux 4.14, Boosts Kernel Performance By Disabling Frame Pointers


  • #11
    Originally posted by ext73 View Post
    My kernel builds ... as always

    CONFIG_FRAME_POINTER is not set

    [NeteXt'73]
    In the days of 32-bit x86, frame pointers took away 1 register out of 5, and this was a real perf impact.

    On 64-bit x86 there are many more general purpose registers, and the perf impact of frame pointers, while not zero, is not nearly as high as it used to be.



    • #12
      Originally posted by arjan_intel View Post

      In the days of 32-bit x86, frame pointers took away 1 register out of 5, and this was a real perf impact.

      On 64-bit x86 there are many more general purpose registers, and the perf impact of frame pointers, while not zero, is not nearly as high as it used to be.
      There were 8 GPRs on 32-bit x86 AFAIK, not 5 … anyway, they aren't arguing about register pressure here (which certainly is less of a problem these days with 16 GPRs on x86-64, and even less on RISC architectures with 32 or more GPRs) but instruction cache pressure and unnecessary computation from function prologues/epilogues or calling convention.



      • #13
        Originally posted by arjan_intel View Post

        In the days of 32-bit x86, frame pointers took away 1 register out of 5, and this was a real perf impact.

        On 64-bit x86 there are many more general purpose registers, and the perf impact of frame pointers, while not zero, is not nearly as high as it used to be.
        That's right about 32 vs. 64 bit ... but removing each additional bit of overhead increases performance, and together with additional optimizations you get this result on an AMD FX 8320 + Nvidia GT 730 2 GB 64 bit, running Kubuntu 16.04.3 + our optimization:

        Below is the game DiRT Rally [native] running, with the settings shown in the video, on the configuration given: The following game DiRT Rally [native] - with the...
        Last edited by ext73; 04 September 2017, 03:29 PM.



        • #14
          Originally posted by ext73 View Post
          My kernel builds ... as always

          CONFIG_FRAME_POINTER is not set

          [NeteXt'73]
          Yeah, but this patch lets you run a debug build and do real-time instrumentation and analysis with somewhere close to 1% of the performance impact compared to before the patch. That's a big deal.

          This is going to help compiler writers create better optimisers (PGO in particular). It's going to improve the speed of live kernel patching. It's going to speed up IDEs so developers can work faster. It's going to be more accurate, so developers waste less time guessing at call stack data. It allows building a debug kernel without CONFIG_FRAME_POINTER, so that distros like Ubuntu that have automated crash reporting perform 2-10% faster on certain apps, such as those using sqlite. It will lower the overhead of various reporting tools like netperf. It may also allow some kernel subsystems to auto-tune better, which could have power management and performance benefits.

          If you don't run a debug kernel and you don't participate in providing feedback to the developers then yeah, it will hardly matter to you today, but it's still going to affect you (positively, and without you lifting a finger) down the road. So this isn't a meh moment.
          Last edited by linuxgeex; 04 September 2017, 03:42 PM.



          • #15
            Originally posted by andrei_me View Post
            Does ORC acronym have any meaning besides joking with dwarf?
            I dunno. Maybe ring the hobbits and ask them. I think it was a vicious rumour started by an ELF (literally).



            • #16
              Originally posted by linuxgeex View Post

              Yeah, but this patch lets you run a debug build and do real-time instrumentation and analysis with somewhere close to 1% of the performance impact compared to before the patch. That's a big deal.

              This is going to help compiler writers create better optimisers (PGO in particular). It's going to improve the speed of live kernel patching. It's going to speed up IDEs so developers can work faster. It's going to be more accurate, so developers waste less time guessing at call stack data. It allows building a debug kernel without CONFIG_FRAME_POINTER, so that distros like Ubuntu that have automated crash reporting perform 2-10% faster on certain apps, such as those using sqlite. It will lower the overhead of various reporting tools like netperf. It may also allow some kernel subsystems to auto-tune better, which could have power management and performance benefits.

              If you don't run a debug kernel and you don't participate in providing feedback to the developers then yeah, it will hardly matter to you today, but it's still going to affect you (positively, and without you lifting a finger) down the road. So this isn't a meh moment.
              See what I wrote above: in our NeteXt'73 Project we want to deliver the highest level of performance, responsiveness, energy efficiency and security to our users. To achieve this we have to collect these 'pebbles' [1%]. Combined with a number of other tweaks, they produce the results you see on my YouTube channel. It seems to me that the other Ubuntu users, running standard builds, already provide enough analytical data.



              • #17
                Originally posted by ext73 View Post

                See what I wrote above: in our NeteXt'73 Project we want to deliver the highest level of performance, responsiveness, energy efficiency and security to our users. To achieve this we have to collect these 'pebbles' [1%]. Combined with a number of other tweaks, they produce the results you see on my YouTube channel. It seems to me that the other Ubuntu users, running standard builds, already provide enough analytical data.
                If you're all about performance... I would be interested to see a benchmark against Clear Linux ;-)



                • #18
                  Originally posted by mattst88 View Post
                  I don't know for sure, but I doubt it.

                  In case you aren't aware, the name DWARF is itself a joke based on the ELF binary format. https://en.wikipedia.org/wiki/DWARF
                  Damn, we now need Star Wars in there too.



                  • #19
                    Originally posted by CrystalGamma View Post
                    There were 8 GPRs on 32-bit x86 AFAIK
                    I doubt you can call the stack pointer "general purpose".
                    Out of 8 registers, one was the stack pointer, one the PIC register and one the frame pointer, leaving only 5 to the compiler. amd64 doesn't need a PIC register, so it has 14 left once the stack pointer and frame pointer are set aside (almost 3 times as many as x86).
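The register arithmetic in this post can be tallied in a few lines. This is only a rough sketch: the helper name `free_registers` is made up for illustration, the PIC register is only tied up when building position-independent code, and the "14" is read here as the count left after both the stack pointer and the frame pointer are set aside.

```python
def free_registers(total, reserved):
    """Registers left for the compiler after removing the reserved ones."""
    return total - len(reserved)

# 32-bit x86: 8 integer registers; PIC code ties one up for the GOT pointer.
x86_with_fp = free_registers(8, ["esp", "ebx (PIC)", "ebp (frame pointer)"])
x86_without_fp = free_registers(8, ["esp", "ebx (PIC)"])

# x86-64: 16 integer registers; RIP-relative addressing needs no PIC register.
amd64_with_fp = free_registers(16, ["rsp", "rbp (frame pointer)"])
amd64_without_fp = free_registers(16, ["rsp"])

print(x86_with_fp, x86_without_fp)      # 5 6
print(amd64_with_fp, amd64_without_fp)  # 14 15
```

On these assumptions, dropping the frame pointer frees one register in five on 32-bit x86 but only one in fourteen on x86-64, which is the point both sides of the argument are circling.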



                    • #20
                      Now it may fit on floppy disk for some stupid.
