-O3 Compiler Optimization Level Still Deemed Too Unsafe For The Linux Kernel


  • perpetually high
    replied
    Originally posted by sdack
    The way I always do it is with a script. I have been doing this for decades now and a script serves me as a notepad but also to make it simple. The gist of it is this:

    Code:
    pushd /usr/src/linux
    make LLVM=1 LLVM_IAS=1 distclean
    cp ../latest-config .config # copies the last good configuration over
    make LLVM=1 LLVM_IAS=1 oldconfig prepare
    make LLVM=1 LLVM_IAS=1 all
    make LLVM=1 LLVM_IAS=1 modules_install # on Arm I also add "dtbs_install" here
    make LLVM=1 LLVM_IAS=1 install # on Arm I use "zinstall" as the target instead
    
    pushd /usr/src/nvidia-xxx.yy # compiling the Nvidia module next
    export IGNORE_CC_MISMATCH=1
    make LLVM=1 LLVM_IAS=1 clean
    make LLVM=1 LLVM_IAS=1 KERNEL_UNAME="5.12.12" modules
    make LLVM=1 LLVM_IAS=1 KERNEL_UNAME="5.12.12" modules_install
    
    # whatever else you want to compile comes next, e.g. out-of-tree device drivers
    # ...
    
    popd # Nvidia
    popd # linux kernel
    
    update-initramfs -u -k 5.12.12 # update the initramdisk
    update-grub # update grub to include the new kernel
    
    # On Arm I also run "mkimage" to update the boot script so it boots the new kernel with dtb and initramdisk
    That is the short version. Hope you can pick out what you need.
    Thank you my good sir! Will give it a look and add to my script. Cheers : )

    TIL about pushd, jesus. How'd I not know about this before



  • sdack
    replied
    Originally posted by perpetually high
    So it finished compiling, this was with make -j4 all:

    LD [M] sound/usb/line6/snd-usb-variax.ko
    LD [M] sound/usb/misc/snd-ua101.ko
    LD [M] sound/usb/usx2y/snd-usb-us122l.ko
    LD [M] sound/usb/usx2y/snd-usb-usx2y.ko
    LD [M] sound/x86/snd-hdmi-lpe-audio.ko
    LD [M] sound/xen/snd_xen_front.ko
    linux-5.12.12 ❯ ls
    arch crypto Documentation include kernel mm scripts tools COPYING Kconfig modules.builtin modules.order System.map vmlinux.symvers
    block debian drivers init lib net security usr CREDITS MAINTAINERS modules.builtin.modinfo Module.symvers vmlinux
    certs debian.master fs ipc LICENSES samples sound virt Kbuild Makefile modules-only.symvers README vmlinux.o

    I'm not sure what to do now. ...
    The way I always do it is with a script. I have been doing this for decades now and a script serves me as a notepad but also to make it simple. The gist of it is this:

    Code:
    pushd /usr/src/linux
    make LLVM=1 LLVM_IAS=1 distclean
    cp ../latest-config .config     # copies the last good configuration over
    make LLVM=1 LLVM_IAS=1 oldconfig prepare
    make LLVM=1 LLVM_IAS=1 all
    make LLVM=1 LLVM_IAS=1 modules_install     # on Arm I also add "dtbs_install" here
    make LLVM=1 LLVM_IAS=1 install     # on Arm I use "zinstall" as the target instead
    
    pushd /usr/src/nvidia-xxx.yy     # compiling the Nvidia module next
    export IGNORE_CC_MISMATCH=1
    make LLVM=1 LLVM_IAS=1 clean
    make LLVM=1 LLVM_IAS=1 KERNEL_UNAME="5.12.12" modules
    make LLVM=1 LLVM_IAS=1 KERNEL_UNAME="5.12.12" modules_install
    
    # whatever else you want to compile comes next, e.g. out-of-tree device drivers
    # ...
    
    popd     # Nvidia
    popd     # linux kernel
    
    update-initramfs -u -k 5.12.12     # update the initramdisk
    update-grub     # update grub to include the new kernel
    
    # On Arm I also run "mkimage" to update the boot script so it boots the new kernel with dtb and initramdisk
    That is the short version. Hope you can pick out what you need.
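
The hard-coded "5.12.12" in the script above has to be edited for every new kernel. A minimal sketch of one way around that, assuming the tree is already configured. The helper names kmake, kernel_release and build_debs are made up for illustration and are not part of sdack's script:

```shell
# Hypothetical helper: run make with the Clang/LLVM switches used in the script above.
kmake() {
    make LLVM=1 LLVM_IAS=1 "$@"
}

# Hypothetical helper: print the release string of the kernel tree in the
# current directory. "kernelrelease" is a standard kernel make target; it
# expects a configured tree (for example after "oldconfig prepare").
kernel_release() {
    kmake -s kernelrelease
}

# Hypothetical helper: build .deb packages with the kernel's own "bindeb-pkg"
# target, passing the same LLVM switches on the make command line.
# $1 is the number of parallel jobs (assumed default: 4).
build_debs() {
    kmake -j"${1:-4}" bindeb-pkg
}
```

With these, the later steps could read KVER=$(kernel_release), then KERNEL_UNAME="$KVER" for the Nvidia build and update-initramfs -u -k "$KVER". build_debs only matters if .deb packages are wanted, as in the make-kpkg workflow mentioned elsewhere in the thread, and it is untested against that exact Ubuntu setup.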



  • perpetually high
    replied
    Originally posted by sdack
    Check your kernel configuration and make sure you only compile as modules those parts that are not essential for you. The less that gets compiled as a module, the more benefit LTO will give you. Modules get compiled with LTO too, but LTO can then only optimise within each module, without the full kernel scope. Hence compiling essential parts directly into the kernel allows for more inlining and better optimisations during the final LTO pass.
    Yeah I got all of that stuff tweaked out to the max, appreciate the heads up though.

    So it finished compiling, this was with make -j4 all:

    LD [M] sound/usb/line6/snd-usb-variax.ko
    LD [M] sound/usb/misc/snd-ua101.ko
    LD [M] sound/usb/usx2y/snd-usb-us122l.ko
    LD [M] sound/usb/usx2y/snd-usb-usx2y.ko
    LD [M] sound/x86/snd-hdmi-lpe-audio.ko
    LD [M] sound/xen/snd_xen_front.ko
    linux-5.12.12 ❯ ls
    arch crypto Documentation include kernel mm scripts tools COPYING Kconfig modules.builtin modules.order System.map vmlinux.symvers
    block debian drivers init lib net security usr CREDITS MAINTAINERS modules.builtin.modinfo Module.symvers vmlinux
    certs debian.master fs ipc LICENSES samples sound virt Kbuild Makefile modules-only.symvers README vmlinux.o

    I'm not sure what to do now. Before I was using "fakeroot debian/rules binary-headers binary-generic binary-perarch" to build my kernel and that was giving me .deb's

    I tried using "LLVM=1 LLVM_IAS=1 sudo make-kpkg -j 4 --rootcmd fakeroot --initrd --append-to-version=-lto kernel-image kernel-headers" but that uses GCC

    Can you let me know what the next step is? Thanks again

    EDIT: made some clarifications
    Last edited by perpetually high; 22 June 2021, 10:02 AM.



  • sdack
    replied
    Originally posted by perpetually high
    Hey dude, I just wanted to say thanks, because I'm compiling the kernel as we speak with full LTO and I'm stoked. ...
    Check your kernel configuration and make sure you only compile as modules those parts that are not essential for you. The less that gets compiled as a module, the more benefit LTO will give you. Modules get compiled with LTO too, but LTO can then only optimise within each module, without the full kernel scope. Hence compiling essential parts directly into the kernel allows for more inlining and better optimisations during the final LTO pass.
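
To see how a configuration splits between built-in code and modules, here is a minimal sketch. The function names are made up for illustration, and the file is assumed to be a regular kernel .config:

```shell
# Hypothetical helper: count options built into the kernel image (=y)
# in the config file given as $1 (default: .config).
count_builtin() {
    grep -c '^CONFIG_[A-Za-z0-9_]*=y$' "${1:-.config}"
}

# Hypothetical helper: count options built as loadable modules (=m)
# in the config file given as $1 (default: .config).
count_modules() {
    grep -c '^CONFIG_[A-Za-z0-9_]*=m$' "${1:-.config}"
}
```

Run from the top of the kernel tree, count_modules prints how many options are still built as modules. The kernel's localyesconfig target is one way to turn currently loaded modules into built-ins, though whether that suits a given machine is a judgment call.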
    Last edited by sdack; 22 June 2021, 07:44 AM.



  • perpetually high
    replied
    Originally posted by sdack
    Forget -O3. Build the kernel with Clang and enable LTO. Simply pass LLVM=1 and LLVM_IAS=1 as arguments to make and you should be able to switch on LTO optimisation.

    It is so simple, even Michael should be able to do it.
    Hey dude, I just wanted to say thanks, because I'm compiling the kernel as we speak with full LTO and I'm stoked. Been wanting to try it for a while.

    I switched 2 weeks ago from -O3 -march=native to -O2 -march=native, and it's been great. So I'm excited to see if I can tell a difference with clang and LTO.

    Will chime back in afterwards if I notice a difference, but your hot tip was exactly what I needed to get going.

    It took a little work for Ubuntu 20.04.2 understandably since not many people are doing this (lld was 10.0.0, needed 10.0.1, so I went balls out and got lld-12). I also needed libclang-12-dev, and then that was it.

    Thanks again. Cheers
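
Which LTO mode a build actually uses can be read back from the configuration. A minimal sketch, assuming the CONFIG_LTO_CLANG_FULL and CONFIG_LTO_CLANG_THIN symbols that kernels from 5.12 on provide; lto_mode is a made-up helper name:

```shell
# Hypothetical helper: report the Clang LTO mode selected in the kernel
# config file given as $1 (default: .config). Prints "full", "thin" or "none".
lto_mode() {
    if grep -q '^CONFIG_LTO_CLANG_FULL=y$' "${1:-.config}"; then
        echo full
    elif grep -q '^CONFIG_LTO_CLANG_THIN=y$' "${1:-.config}"; then
        echo thin
    else
        echo none
    fi
}
```

In a tree configured for full LTO, lto_mode prints full. The kernel's scripts/config tool can flip the choice from the shell, for example scripts/config -e LTO_CLANG_FULL -d LTO_NONE followed by make LLVM=1 LLVM_IAS=1 olddefconfig, but treat that as a sketch rather than a tested recipe.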



  • indepe
    replied
    Originally posted by Hugh
    This is a case where compiler culture is non-intuitive. And by the time you learn enough to understand it, you are likely absorbed into it and share it.

    Explanation: if the programmer wrote *ptr, he is asserting that that is legal and thus that ptr is not NULL. The compiler, desperate for any advantage, uses/exploits this information.
    Yes, I understood. However, when the compiler then encounters the "if (ptr == NULL)", it would (ideally) notice the contradiction, cancel the previous expectation, and issue a warning, if not an error, instead of using only half the information and insisting on an optimization based on a hint that a more explicit statement has already contradicted.

    So from my point of view, that is simply a compiler bug that should be fixed instead of given a pass. Even more so if that is "compiler culture". It kind of confirms Linus' complaint about compiler treatment of "undefined behavior". I can't see how I might "share" that "desperate" point of view. It would be a wrong priority.

    Originally posted by Hugh
    I felt the same way as you when I first encountered optimizing compilers (in the late 1960s). I've written tools to use optimizing-like techniques to detect "anomalies" in programs. It's amazing how poorly programmers take nagging. I've been looking at diagnostics from Coverity Scan today and it is tiresome, if useful.
    Maybe I should try out Coverity Scan. In any case, if I wrote code like the above, I certainly would want the compiler to tell me about it.



  • Hugh
    replied
    Originally posted by indepe

    I think in this case the compiler should issue an error or at least a warning about the undefined behavior, instead of using it to optimize.

    (Because the reverse assumption is just as valid: an attempt to dereference a pointer that is NULL in some cases. The code doesn't make any sense in the first place.)

    In other words, it is a solvable problem.
    This is a case where compiler culture is non-intuitive. And by the time you learn enough to understand it, you are likely absorbed into it and share it.

    Explanation: if the programmer wrote *ptr, he is asserting that that is legal and thus that ptr is not NULL. The compiler, desperate for any advantage, uses/exploits this information.

    I felt the same way as you when I first encountered optimizing compilers (in the late 1960s). I've written tools to use optimizing-like techniques to detect "anomalies" in programs. It's amazing how poorly programmers take nagging. I've been looking at diagnostics from Coverity Scan today and it is tiresome, if useful.

    BTW, it is almost impossible to reply due to ads causing my window to jump around.



  • indepe
    replied
    Originally posted by Hugh
    Code:
    junk = *ptr;
    if (ptr == NULL)
      something();
    After optimization, there is no call to something!
    Dereferencing a null pointer is UB. The compiler is allowed to infer that ptr is not null. So the IF can be eliminated!

    This would likely surprise any C coder who didn't already know it.

    I think that GCC stopped exploiting this perfectly legal inference in kernel code because it surprised programmers too much.

    Furthermore, in some kernel code, 0 is a valid address. So things are even more weird. The compiler doesn't know this. It is kind of paradoxical from the standpoint of the C language.
    I think in this case the compiler should issue an error or at least a warning about the undefined behavior, instead of using it to optimize.

    (Because the reverse assumption is just as valid: an attempt to dereference a pointer that is NULL in some cases. The code doesn't make any sense in the first place.)

    In other words, it is a solvable problem.
    Last edited by indepe; 16 June 2021, 04:30 PM.



  • Hugh
    replied
    So much heated whataboutism in this thread. Please stop.

    Originally posted by foobaz

    In my experience as a developer, the most common problem with -O3 is not bad compilers, but bad code. It's not too hard to accidentally write C code that depends on undefined behavior. Such code can work as expected with -O2 but fail under -O3. It can even work fine for years with -O3, until some minor change to seemingly unrelated code tips the optimizer into behaving differently and it breaks. Or updates to the compiler can change optimization behavior and break programs which previously worked but are technically invalid.

    I've seen -O3 bugs with LLVM as well as with GCC. It's easy to say "well just don't write bad code then" but historically this strategy has not been shown to be effective with C programming.
    This is so true.

    Programmers have a model of how C works. Generally, they are wrong. This is mostly harmless until an optimizer uses its discretion to change things using the as-if rule and exploiting Undefined Behaviour.

    A classic example from buggy kernel code, from my dodgy memory:

    Code:
    junk = *ptr;
    if (ptr == NULL)
      something();
    After optimization, there is no call to something!
    Dereferencing a null pointer is UB. The compiler is allowed to infer that ptr is not null. So the IF can be eliminated!

    This would likely surprise any C coder who didn't already know it.

    I think that GCC stopped exploiting this perfectly legal inference in kernel code because it surprised programmers too much.

    Furthermore, in some kernel code, 0 is a valid address. So things are even more weird. The compiler doesn't know this. It is kind of paradoxical from the standpoint of the C language.
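
Whether a given kernel tree asks the compiler not to make this inference can be checked directly. A minimal sketch, assuming the tree's top-level Makefile is where the flag would appear. GCC's -fno-delete-null-pointer-checks is the option that disables the optimisation, and has_null_check_flag is a made-up helper name:

```shell
# Hypothetical helper: succeed if the makefile given as $1 (default: Makefile)
# mentions -fno-delete-null-pointer-checks anywhere, fail otherwise.
has_null_check_flag() {
    grep -q -e '-fno-delete-null-pointer-checks' "${1:-Makefile}"
}
```

For example, has_null_check_flag /usr/src/linux/Makefile && echo "null checks are kept" prints the message only if the flag is present. The check is a plain text search, so it says nothing about whether the flag is actually applied to every object file.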



  • sdack
    replied
    Originally posted by F.Ultra
    It's not Trumps fault ...
    This is not about a bug and nobody is talking about whose fault it was. Torvalds is not accusing a specific person, but he generalises, rants and talks shit about an entire project. A point was then made here on the forum that he should look at his own project before he talks shit about others, and it is a well-made point. Stop making excuses for Torvalds, stop saying it was not his fault, do not defend him. His behaviour is not acceptable. So do not enable it and do not support it.
    Last edited by sdack; 08 June 2021, 03:50 PM.

