ORC Unwinder For Linux 4.14, Boosts Kernel Performance By Disabling Frame Pointers


  • #21
    Originally posted by Pawlerson View Post
    Now it may fit on floppy disk for some stupid.
    It cuts a few tens of kB from the kernel at best... floppies FTW.



    • #22
      Originally posted by pal666 View Post
      in the days of the 32 bit x86, frame pointers took away 1 register out of 5, and this was a real perf impact.
      Originally posted by pal666 View Post
      i doubt you can call stack pointer "general purpose"
      out of 8 registers one was stack pointer, one pic register and one frame pointer, leaving only 5 to compiler. amd64 doesn't need pic register, so it has 14 without frame pointer(almost 3 times more than x86)
      No, it's taking one register from out of 7.

      There are 8 gprs, out of which one is a stack pointer, leaving you with 7. One of these (ebp) is historically used as a frame pointer in case you use them (this is the one out of 7 you lose when you build without -fomit-frame-pointer). There is no such thing as a "pic register" (or what did you mean really?).



      • #23
        Originally posted by andrei_me View Post
        Does ORC acronym have any meaning besides joking with dwarf?



        • #24
          Originally posted by Wielkie G View Post



          No, it's taking one register from out of 7.

          There are 8 gprs, out of which one is a stack pointer, leaving you with 7. One of these (ebp) is historically used as a frame pointer in case you use them (this is the one out of 7 you lose when you build without -fomit-frame-pointer). There is no such thing as a "pic register" (or what did you mean really?).
          in 32 bit, EBX is often used as the PIC register (for EIP relative addressing)

          also ESI/EDI, although GP's, are limited in how they can be used.



          • #25
            Originally posted by arjan_intel View Post

            in 32 bit, EBX is often used as the PIC register (for EIP relative addressing)

            also ESI/EDI, although GP's, are limited in how they can be used.
            Heck, even most RISCs don't have 32 real GPRs; one of them is usually zero (RISC-V, MIPS, MSP430, ARMv8). ARMv8 is a little funky: register number 31 encodes either the zero register or the stack pointer, depending on which instruction you are using.
            Last edited by microcode; 05 September 2017, 12:43 AM.



            • #26
              Originally posted by ext73 View Post

              See what I wrote above - in our NeteXt'73 Project we want to deliver the highest level of performance, responsiveness, energy efficiency and security to our users. To achieve this we have to collect these 'pebbles' [1%]. Overall, a number of other treatments result in the results you see on my YouTube channel. It seems to me that the rest of Ubuntu users using standard solutions already provide enough of the analytical data.
              Sorry I don't have time to go watch your video without some very good indications why I should.



              • #27
                Originally posted by pal666 View Post
                i doubt you can call stack pointer "general purpose"
                out of 8 registers one was stack pointer, one pic register and one frame pointer, leaving only 5 to compiler. amd64 doesn't need pic register, so it has 14 without frame pointer(almost 3 times more than x86)
                frame pointer != stack pointer.



                • #28
                  Originally posted by Wielkie G View Post



                  No, it's taking one register from out of 7.

                  There are 8 gprs, out of which one is a stack pointer, leaving you with 7. One of these (ebp) is historically used as a frame pointer in case you use them (this is the one out of 7 you lose when you build without -fomit-frame-pointer). There is no such thing as a "pic register" (or what did you mean really?).
                  The PIC register is %ebx: you give up a register to use as a program base pointer so that all jumps can be register-relative, hence PIC. PIC has advantages that offset the cost of losing a register for code generation, i.e. no modification of the text pages during linking, so you can have randomized load locations per library per process while still sharing the blocks unmodified in memory. So your code may be 3% longer but 12% more is shared. x86 also has a lot more than 8 physical GPRs thanks to register renaming in modern x86 processors. The same architectural register can be in use by multiple in-flight ops at once, so losing 1 in 7 isn't anywhere near as big a loss as it sounds.
                  Last edited by linuxgeex; 05 September 2017, 03:55 AM.
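
                  The two register counts argued over in this thread (5 vs. 7 on 32-bit x86) can be reconciled with a small tally. This is only a rough sketch under simplifying assumptions: `allocatable` and the `GPRS` table are hypothetical names made up for illustration, the stack pointer is treated as always reserved, and %ebx is assumed to be a fixed PIC register on i386 whenever PIC is enabled (real compilers may be more flexible than that).

```python
# Hypothetical tally of x86 general-purpose registers left to the
# compiler's register allocator. Simplified model, not a compiler's logic.

GPRS = {
    "i386": ["eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"],
    "x86_64": ["rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp"]
              + [f"r{n}" for n in range(8, 16)],
}


def allocatable(arch, frame_pointer, pic):
    """Return the registers still free under the assumed reservations."""
    reserved = {"esp", "rsp"}          # stack pointer is never free
    if frame_pointer:
        reserved |= {"ebp", "rbp"}     # lost when frame pointers are kept
    if pic and arch == "i386":
        reserved.add("ebx")            # assumed fixed PIC register on i386
    return [r for r in GPRS[arch] if r not in reserved]


if __name__ == "__main__":
    cases = [
        ("i386", False, False),   # 7 free
        ("i386", True, False),    # 6 free
        ("i386", True, True),     # 5 free
        ("x86_64", True, False),  # 14 free
        ("x86_64", False, False), # 15 free
    ]
    for arch, fp, pic in cases:
        free = allocatable(arch, fp, pic)
        print(f"{arch:7} frame_pointer={fp!s:5} pic={pic!s:5} -> {len(free)} free")
```

                  Under these assumptions, "7" is the i386 count with neither a frame pointer nor PIC, and "5" is the count with both. On x86-64 the same model gives 14 with a frame pointer and 15 without.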



                  • #29
                    Originally posted by debianxfce View Post
                    In general, a custom non debug 1000Hz timer kernel is much faster than stock distribution kernels.
                    Much faster? When did you last benchmark that?



                    • #30
                      Originally posted by debianxfce View Post

                      Do benchmarks yourself.
                      Your claim, your duty to prove it.

                      Originally posted by debianxfce View Post
                      A custom non-debug kernel boots 3 seconds faster on my computer (x4 845 with SATA SSD). In the Tomb Raider 2013 Windows version benchmark I see a 1-2 fps increase when using wine-staging. Test with an Amlogic S912 device too; there you will see a huge difference.



                      16. Don't use debug kernels. Debug kernels are slow.
                      When the kernel is both custom and non-debug, you can't automatically pin all performance gains on removing the debug info.
