Linux 5.15's New "-Werror" Behavior Is Causing A Lot Of Pain


  • #41
    Originally posted by MrCooper View Post
    This is a rare case where Linus was simply wrong.

    Enabling -Werror by default is bad because it causes pain for lots of people, most of whom have no responsibility for, or even control over, the factors which result in compiler warnings.

    The right place for -Werror is in CI which gates merging changes. That way it causes pain only for developers who try to merge changes which result in warnings.
    I partially agree: -Werror before merge should indeed be required, but if the rest of the kernel is not warning-free, that doesn't help much, since the build error will be triggered in code outside the committer's context. So for -Werror on CI to work, the rest of the kernel must first be buildable with it. And if the kernel can build with -Werror on its own, there's no point in not enabling it by default.
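    The gating idea can be sketched as a CI step; a minimal sketch, assuming the tree already builds warning-free (the job layout is hypothetical, the make flags are standard kbuild):

```shell
#!/bin/sh
# Hypothetical CI gate: build the submitted series with warnings
# promoted to errors, so only changes that introduce new warnings
# are blocked at merge time.
set -e
make defconfig
make -j"$(nproc)" KCFLAGS=-Werror
```

    Since 5.15 the same promotion is available in-tree via CONFIG_WERROR=y, which is what this whole thread is about.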


    • #42
      Originally posted by curfew View Post
      It's time for those $200k-per-year overpaid engineers to get up from their asses and start to do the lifting too.
      While I agree with the rest of the post, not all kernel devs are $200k-per-year overpaid. The short time I got paid to do some kernel coding (not really a lot of it, but whatever) I was making about... $30k-per-year. Third world country and all that, but even for the country it wasn't a big salary.

      Originally posted by partcyborg View Post
      With Linux it's even crazier than that sometimes. For example, I used to build custom Android roms, and simply building the exact same kernel source using a different gcc version would result in a kernel that would refuse to boot.
      And often it's things that were warned about in prior versions that become breaking in new ones, like some undefined behavior, which is acceptable to change over versions. So breaking the build early on those is a good thing.

      Originally posted by lumks View Post
      While I think that it should be enabled by default, I also think there should be a target toolchain infrastructure to prove against. It feels like a crazy move to enable this for everything without knowing what downstream builds against. Like: 5.15 builds, 5.15.1 builds, 5.15.2 won't build because an LLVM update now creates a feature-deprecation warning. Badumm.
      Well, the article does mention build farms, that's what they're there for AFAICT.

      Originally posted by thedukesd View Post
      Maybe a question for Linus himself is what he prefers:
      1) code that gives no warnings but is totally broken with the hardware it's supposed to work with (there are already such things in the kernel, with no plans to remove them)
      2) code that gives warnings but works just fine with the hardware
      The problem there is that 2) is not generally true. Some warnings lead to breakage after compiler updates, or in specific circumstances. Such code "works just fine with the hardware" only while the dev is developing it, and then breaks with no clue as to why.

      Originally posted by thedukesd View Post
      While Linus' idea is not bad from one point of view, there is a bigger problem. If a compiler change makes code that until now gave no warnings start giving warnings, how does Linus expect this to be treated? Because there are 2 options here:
      1) force the people working on that compiler to revert the change that causes the warnings, since until now that wasn't the case
      2) force people to fix code that so far compiled fine, and deal with people plainly refusing to do it, ending up in a situation where you either drop that stuff from the kernel or plain fix it yourself
      1) If the warning is correct, then the code was incorrect to begin with. There's nothing for the compiler to fix; it's being helpful by pointing out your code is broken, even if previous versions couldn't detect it.
      2) Yes. That's what maintainership is for. You don't do code drops; you make a commitment to keeping the code working. If you hit a bug, you report it, the maintainer fixes it, and in that kind of case probably backports the fix to stable releases.

      Originally posted by thedukesd View Post
      Coding something in the Linux kernel is like moving sand. A driver coded for kernel 3.18, happily running on 3.18, will plain not work on 5.15 and will require a lot of patches to actually work properly.
      When a kernel change makes a driver no longer work, I don't consider it the job of the one who coded the driver to fix it; it's the job of those who submitted, and accepted upstream, the kernel patches causing the problem, because they are the ones actually breaking compatibility...
      I'm not talking here about code that doesn't compile; I'm talking about code that compiles fine but totally crashes/misbehaves due to kernel changes done by someone else.
      It's the job of whoever made the commitment. The kernel explicitly makes no commitment to keeping out-of-tree code working. So either you send it upstream and __commit__ to maintaining it, or you're on your own. Anything else is plain entitlement :shrug:
      It's not feasible to expect the kernel community to keep up with all the random out-of-tree drivers. In-tree drivers get fixed along with the patch that breaks them.

      Originally posted by thedukesd View Post
      In Windows' case I can take a driver from 2012, happily install it, and it will just work; I can't really say the same about Linux. This is a way bigger issue than the fact that some stuff gives warnings while compiling...
      Maybe next time Linus will decide to remove everything that shows warnings in the logs despite the hardware being 100% healthy (he won't really do that, because you'd have a hard time finding a PC that would actually work...)
      Yes. That's most likely because, you know, consumers pay for Windows to meet those expectations. And it's quite costly to do so. Besides, I've met situations where that was not the case: some cheap on-board GPUs, an old scanner, an old video capture card, etc.
      Third-party drivers are an undebuggable abomination. If your provider cared in the slightest they'd be upstreamed, and not only would you not have that problem, it'd work out of the box without installing random blobs.

      Originally posted by thedukesd View Post
      With a small amount of supported hardware you can test whether changes break something. With lots of hardware you can't; you depend on other people, people who might not have the best intentions, who might only partially test it, who might just want to say in their resume that they have code upstream in the kernel, and so on...
      So, you're complaining about drivers that break and at the same time complaining too much hardware is supported? That sounds like "only my setup should be supported, fsck everyone else".


      • #43
        Linus pussied out for the wrong reason. Another Linux fail. What's new.


        • #44
          Unless you want to break the ability to build previous releases with newer compilers (which have newer warnings), you don't put -Werror in the release build. Who is building previous releases with newer compilers? Users!

          What's more sane is to turn specific warnings into errors: -Werror=format-security and -Werror=switch should be obligatory in my opinion, because you don't need to ever tolerate these warnings (if your code is written to that standard).


          • #45
            Originally posted by briceio View Post
            It isn't an easy question. But warnings are just... warnings. So by definition they shouldn't block anything. I think it would have been better to re-qualify some warnings in errors (some of them... not all). Because like it has been said: compiler changes add/remove warnings and some code written a few years back might need a refresh if, and only if, it generates an error.
            The problem here is that the term "warning" is wrong, and I've thought so since the Borland Turbo C days, because it leads programmers to the same train of thought you just had: "Hey, it's just a warning and I don't see any obvious crash, voilà, it works!" when in reality it should be named what it is: "undefined behavior".

            When a compiler gives you a "warning", it's telling you: "Dude, this seems wrong, but with some dark magic I can make it look valid; at runtime, well, let's pray it works." And in most cases it will work, until it won't (crash, security breach, corrupted data, etc.).

            Why didn't compilers make -Werror the default since day 1, then? Well, in short, this undefined behavior is very fast and is needed for certain operations, and it can be 100% safe if the developer understands 100% how the hardware will operate at runtime, in which case it should be dutifully commented in the code.


            • #46
              Originally posted by sinepgib View Post

              While I agree with the rest of the post, not all kernel devs are $200k-per-year overpaid. The short time I got paid to do some kernel coding (not really a lot of it, but whatever) I was making about... $30k-per-year. Third world country and all that, but even for the country it wasn't a big salary.
              (Refers to unapproved post)
              Actually, I did the numbers more carefully and it was about $12k-per-year :'(
              (I had to convert from local currency from memory before, now I looked up the value at the time)

              Besides, there are also unpaid kernel devs. Not the majority, but not negligible either.


              • #47
                So Linus made a very smart move, masquerading a warning shot as a straight bomb into everyone's shitholes to make them wake up and clean up their mess.

                Pretty sure the topic will boomerang back in around 2 years or so.

                It's as non-diplomatic as a straight right, but developers probably earned it. I really appreciate everyone who contributes to the Linux kernel, as I'd be far from capable of doing so myself, but in my experience writing code is meaningless if it has obvious flaws that one lets live...
                I hope this move will make developers bring up a "clean up all my historic mess" task plan in the coming months. Keep up the good work, guys.


                • #48
                  Originally posted by jrch2k8 View Post
                  "Warning" (…) should be named as it is "Undefined Behavior"

                  safe if the developer understand 100% how the hardware will operate at runtime
                  No and no.

                  Compilers absolutely warn about language-defined behavior too, and it's not even all that useful. Example: -Wsign-compare: It's not like integer promotion rules aren't perfectly defined by the language. I would go as far as to say that when it comes to equality comparisons (where signedness of comparison is not even a thing), this warning is 100% detrimental: There is only one interpretation – the intended one, so there is nothing to whine about, all seasoned programmers know the integer promotion rules anyway, and it is detrimental, because "fixing" the code by adding explicit casts makes it not just less readable but more brittle.

                  In contrast, compilers pretty much never warn about undefined behavior. Rather, it's an optimization opportunity: If the compiler sees undefined behavior somewhere in the code, it has the license to assume that the condition that would make it happen won't occur. That's how branches get deleted. Scary!

                  For that reason, it doesn't help to know how the hardware works: You can't know what you are doing, because the compiler has the license to delete your code. So in contrast to the low-level portable assembly language it tries to be, C is actually a treacherous language that won't faithfully do as the source code says. Yes, there are good warnings that can tell if the code is for sure wrong (such as format-security and unhandled cases in enum switches), but UB is so ingrained that there is no remedy.

                  All that said, I do wish there was an incompatible dialect of C where UB was just forbidden (like Zig/Rust).
                  Last edited by andreano; 08 September 2021, 03:19 PM.


                  • #49
                    Yesterday everyone was overjoyed with -Werror being the best thing to ever happen to the Linux kernel, and anyone who opposed that view didn't know anything about coding! Now that it's reverted... sorry, demoted, everyone again starts to see why it wasn't a well-executed idea after all. Typical cult behavior.


                    • #50
                      Short term : bad.

                      Long term : good.

                      Sooner or later it should've been done.