Proposed: Allow Building The Linux Kernel With x86-64 Microarchitecture Feature Levels


  • #11
    Originally posted by coder
    This is small potatoes. There's no way it can deliver similar benefits to using -march=native, because that implies -mtune=native, which using a slightly better feature level doesn't.
    That's not about benefiting your local build. It is intended more for distro maintainers who currently ship binary kernels built with a generic -march. If, say, Manjaro decides to support only Haswell and newer CPUs, all it would need to do is build the kernel with the x86-64-v3 flags. Or Ubuntu could ship kernel packages with -v2 and -v3 suffixes.

    Comment


    • #12
      I don't know enough about kernel building or CPU instructions to say I know better, but wouldn't it make more sense for software to just simply be told at launch time what the available instructions are so it can take advantage of them if able? Build one version that is capable of taking advantage of all instructions the developer implemented, but it only uses what it actually can use. I feel like that's just a little too obvious though, where there's some major reason why that can't be done.

      Comment


      • #13
        Originally posted by schmidtbag
        I don't know enough about kernel building or CPU instructions to say I know better, but wouldn't it make more sense for software to just simply be told at launch time what the available instructions are so it can take advantage of them if able? Build one version that is capable of taking advantage of all instructions the developer implemented, but it only uses what it actually can use. I feel like that's just a little too obvious though, where there's some major reason why that can't be done.
        That's what HWCAPS and the different V levels are for. The problem with the kernel is...well...there really isn't, or, rather, won't be. They just need to add a function to the initrd to get the system's V level and then load the appropriate kernel. Distributions will basically have to build 4 kernels (one for each level since 1 is also generic) to make it fire and forget.
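
To make the "get the system's V level, then load the appropriate kernel" idea concrete, here is a rough userspace sketch in Python. It is not how any distro or initrd does it. The flag names are the ones I'd expect on the "flags" line of /proc/cpuinfo (with "pni" for SSE3, "abm" for LZCNT and "xsave" standing in for OSXSAVE), and the `vmlinuz-x86-64-vN` naming scheme is purely hypothetical.

```python
# Flag names as they appear on the "flags" line of /proc/cpuinfo on Linux.
# Assumption: "pni" is SSE3, "abm" stands in for LZCNT, "xsave" for OSXSAVE.
LEVEL_FLAGS = [
    (1, {"lm", "cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse", "sse2"}),
    (2, {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    (3, {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    (4, {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def feature_level(flags):
    """Return the highest x86-64 level (0-4) whose flags are all present.

    Levels are cumulative, so stop at the first level with a missing flag.
    """
    have = set(flags)
    level = 0
    for candidate, required in LEVEL_FLAGS:
        if not required <= have:
            break
        level = candidate
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Parse the first "flags" line of a cpuinfo-style file."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return line.split(":", 1)[1].split()
    return []


def kernel_image_for(level):
    """Hypothetical naming scheme: one image per level, v1 as the fallback."""
    return "vmlinuz-x86-64-v%d" % max(level, 1)
```

On a Linux box, `kernel_image_for(feature_level(read_cpu_flags()))` would give the image name to boot under that made-up scheme.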

        Comment


        • #14
          Originally posted by skeevy420

          That's what HWCAPS and the different V levels are for. The problem with the kernel is...well...there really isn't, or, rather, won't be. They just need to add a function to the initrd to get the system's V level and then load the appropriate kernel. Distributions will basically have to build 4 kernels (one for each level since 1 is also generic) to make it fire and forget.
          Maybe they can bake this function right into the kernel itself.

          The kernel already has different setup code for different hardware, which is switched on automatically at runtime and can be disabled via the kernel command line.

          Adding such a switch probably isn't that difficult.

          Comment


          • #15
            Originally posted by coder
            ... benefits to using -march=native...
            What was the last Phoronix or other test of -march=native?



            Was 2018.
            Last edited by elatllat; 18 August 2021, 10:17 AM.

            Comment


            • #16
              It would be enough for a diagnostic tool to check which features the CPU supports during OS installation, so that the installer can pick the optimized version matching the instructions found on that CPU.

              Comment


              • #17
                Originally posted by schmidtbag
                I don't know enough about kernel building or CPU instructions to say I know better, but wouldn't it make more sense for software to just simply be told at launch time what the available instructions are so it can take advantage of them if able? Build one version that is capable of taking advantage of all instructions the developer implemented, but it only uses what it actually can use. I feel like that's just a little too obvious though, where there's some major reason why that can't be done.
                That's pretty much the description of dynamic dispatch. You would generally apply it to expensive functions only. You implement such a function in several versions and select which one to call at runtime. ICC had that by default IIRC. No idea if current versions of GCC and Clang can do that as well. I think the kernel does it for some hand-written functions in library modules, such as crypto, when the required instruction set is available.
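
A minimal sketch of that dispatch pattern, in Python only so it stays short and runnable. The function names and the "avx2" requirement are made up for illustration, and `checksum_wide` merely stands in for a version that would really use wider instructions.

```python
def checksum_generic(data):
    """Baseline version: works everywhere, one byte per step."""
    total = 0
    for byte in data:
        total = (total + byte) & 0xFFFFFFFF
    return total


def checksum_wide(data):
    """Stand-in for a version that would use wider instructions."""
    return sum(data) & 0xFFFFFFFF


# Ordered from most to least demanding; names are made up for this sketch.
CANDIDATES = [
    ({"avx2"}, checksum_wide),
    (set(), checksum_generic),
]


def select_checksum(cpu_flags):
    """Pick the first implementation whose required flags are all present."""
    have = set(cpu_flags)
    for required, impl in CANDIDATES:
        if required <= have:
            return impl
    raise RuntimeError("no usable implementation")


# Resolve once at startup; callers then go through the chosen function
# without checking features again on every call.
checksum = select_checksum({"sse2"})
```

The point is that the feature check happens once, at selection time, and not inside the hot function itself.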

                Originally posted by discordian

                compile your kernel in the initramfs and kexec the new build.

                In a way, this is already done for 32-bit ARM, which doesn't have mandatory integer division: the kernel detects whether it is available and patches out any calls to the software fallback. This is still a lot worse than compiling it with intdiv support, because the compiler will expect a function call and can't optimize around that.
                There's currently a mechanism in place to overwrite existing code at runtime. I don't remember the name, sadly. What currently happens is that, for some expensive instrumentation, you don't want an actual feature check; instead, you fill the place where the call would go with enough nops that they can be overwritten with the function call when you enable it. You can do the same for intdiv: reserve enough bytes to fit either the intdiv instruction or the function call, write the appropriate one at initialization, and fill any remaining space with nops.
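
Roughly what I mean by reserving a slot, as a toy Python simulation on a byte buffer. This is not kernel code. It uses x86 encodings purely for illustration (0x90 for a one-byte nop, 0xE8 plus a 32-bit displacement for `call rel32`), and the offsets and the fallback address are made up.

```python
import struct

NOP = 0x90  # single-byte x86 nop


def encode_call(site, target):
    """Encode x86 `call rel32`: opcode 0xE8 plus a 32-bit displacement.

    The displacement is relative to the end of the 5-byte instruction.
    """
    return b"\xe8" + struct.pack("<i", target - (site + 5))


def patch_site(text, site, slot_len, insn):
    """Write `insn` into the reserved slot and pad the rest with nops."""
    if len(insn) > slot_len:
        raise ValueError("instruction does not fit in the reserved slot")
    text[site:site + slot_len] = insn + bytes([NOP]) * (slot_len - len(insn))


# A toy "text section": 4 bytes of other code, a 5-byte nop slot, 4 more bytes.
text = bytearray(b"\xaa" * 4 + bytes([NOP]) * 5 + b"\xbb" * 4)

# Feature absent: patch in a call to a (hypothetical) fallback at offset 0x100.
patch_site(text, 4, 5, encode_call(4, 0x100))
```

If the feature were present, the same slot would instead get the short native instruction, with the leftover bytes staying as nops.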

                Comment


                • #18
                  Originally posted by flower
                  Wouldn't it be better to just enable or disable features like sse3 or avx?

                  Usually only very small parts do profit from such features anyway. Might be feasible to include normal and optimized versions in the same build
                  This is not the case. Remember issues like Spectre and Meltdown. There are different quirk workarounds that you only need on older CPU generations because they are fixed in newer ones. For example, the kernel code needed to make SSE3 work correctly is not the same across all CPU generations that support SSE3.

                  Comment


                  • #19
                    Originally posted by sinepgib
                    There's currently a mechanism in place to overwrite existing code at runtime. I don't remember the name, sadly. What currently happens is that, for some expensive instrumentation, you don't want an actual feature check; instead, you fill the place where the call would go with enough nops that they can be overwritten with the function call when you enable it. You can do the same for intdiv: reserve enough bytes to fit either the intdiv instruction or the function call, write the appropriate one at initialization, and fill any remaining space with nops.
                    That's CONFIG_JUMP_LABEL; ftrace and other frameworks can use it or require it.

                    It's more than just a bunch of nops: any instruction can be instrumented or replaced. This is particularly horrible on x86, because instruction lengths are variable and instructions are potentially unaligned. That means there are a lot of cases where, in the first step, an int3 instruction replaces the first byte, then the remainder is patched, and then the first byte is patched. If the code is executed in between, an interrupt handler is involved.
                    I had to disable this functionality on realtime applications for safe operation.
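
The three-step order can be sketched like this, again as a toy Python simulation on a byte buffer and not the kernel's actual code. As far as I know the real thing also has to synchronize the other CPUs between steps and handle the trap itself, both of which this ignores. `poke_with_breakpoint` is a made-up name.

```python
INT3 = 0xCC  # x86 breakpoint opcode, one byte long


def poke_with_breakpoint(text, addr, new_insn):
    """Replace len(new_insn) bytes at `addr` in three ordered writes.

    Returns a snapshot of the patched range after each write so the
    intermediate states can be inspected.
    """
    end = addr + len(new_insn)
    snapshots = []

    # 1. Trap anyone who reaches the site while it is being rewritten.
    text[addr] = INT3
    snapshots.append(bytes(text[addr:end]))

    # 2. Rewrite everything except the first byte.
    text[addr + 1:end] = new_insn[1:]
    snapshots.append(bytes(text[addr:end]))

    # 3. Replace the breakpoint with the real first byte.
    text[addr] = new_insn[0]
    snapshots.append(bytes(text[addr:end]))
    return snapshots
```

In both intermediate states the first byte is the breakpoint, so anything that reaches the site mid-patch traps instead of decoding a half-written instruction. That trap is the latency problem for realtime.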

                    Comment


                    • #20
                      Originally posted by discordian
                      I had to disable this functionality on realtime applications for safe operation.
                      What happens when disabled? A branch is inserted?

                      Comment
