Announcement

**DavidBrown** · 26 February 2021, 08:07 AM

Originally posted by coder View Post

Please explain this in detail. I want to be sure I fully understand what you're proposing.

Compiler Explorer - C++ (x86-64 gcc 9.1)

https://www.godbolt.org/z/95fond

#pragma GCC diagnostic warning "-Wswitch-enum" #pragma GCC diagnostic warning "-Wuninitialized" #pragma GCC diagnostic warning "-Wmaybe-uninitialized" #if 1 typedef enum E { ea, eb, ec } E; #else typedef enum E { ea, eb, ec, ed } E; #endif int test1(E e) { int x; switch (e) { case ea : x = 1; break; case eb : x = 2; break; case ec : x = 3; break; } return x; } int test2(E e) { int x; switch (e) { case ea : x = 1; break; case eb : x = 2; break; case ec : x = 3; break; default : __builtin_unreachable(); } return x; }

Code:

[COLOR=#0000ff]#if[/COLOR][COLOR=#000000] [/COLOR][COLOR=#098658]1[/COLOR]
[COLOR=#0000ff]typedef[/COLOR][COLOR=#000000] [/COLOR][COLOR=#0000ff]enum[/COLOR][COLOR=#000000] E { ea, eb, ec } E;[/COLOR]
[COLOR=#0000ff]#else[/COLOR]
[COLOR=#0000ff]typedef[/COLOR][COLOR=#000000] [/COLOR][COLOR=#0000ff]enum[/COLOR][COLOR=#000000] E { ea, eb, ec, ed } E;[/COLOR]
[COLOR=#0000ff]#endif[/COLOR]

[COLOR=#0000ff]int[/COLOR][COLOR=#000000] test1(E e) {[/COLOR]
[COLOR=#000000]    [/COLOR][COLOR=#0000ff]int[/COLOR][COLOR=#000000] x;[/COLOR]
[COLOR=#000000]    [/COLOR][COLOR=#0000ff]switch[/COLOR][COLOR=#000000] (e) {[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]case[/COLOR][COLOR=#000000] ea : x = [/COLOR][COLOR=#098658]1[/COLOR][COLOR=#000000]; [/COLOR][COLOR=#0000ff]break[/COLOR][COLOR=#000000];[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]case[/COLOR][COLOR=#000000] eb : x = [/COLOR][COLOR=#098658]2[/COLOR][COLOR=#000000]; [/COLOR][COLOR=#0000ff]break[/COLOR][COLOR=#000000];[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]case[/COLOR][COLOR=#000000] ec : x = [/COLOR][COLOR=#098658]3[/COLOR][COLOR=#000000]; [/COLOR][COLOR=#0000ff]break[/COLOR][COLOR=#000000];[/COLOR]
[COLOR=#000000]    }[/COLOR]
[COLOR=#000000]    [/COLOR][COLOR=#0000ff]return[/COLOR][COLOR=#000000] x;[/COLOR]
[COLOR=#000000]}[/COLOR]

[COLOR=#0000ff]int[/COLOR][COLOR=#000000] test2(E e) {[/COLOR]
[COLOR=#000000]    [/COLOR][COLOR=#0000ff]int[/COLOR][COLOR=#000000] x;[/COLOR]
[COLOR=#000000]    [/COLOR][COLOR=#0000ff]switch[/COLOR][COLOR=#000000] (e) {[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]case[/COLOR][COLOR=#000000] ea : x = [/COLOR][COLOR=#098658]1[/COLOR][COLOR=#000000]; [/COLOR][COLOR=#0000ff]break[/COLOR][COLOR=#000000];[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]case[/COLOR][COLOR=#000000] eb : x = [/COLOR][COLOR=#098658]2[/COLOR][COLOR=#000000]; [/COLOR][COLOR=#0000ff]break[/COLOR][COLOR=#000000];[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]case[/COLOR][COLOR=#000000] ec : x = [/COLOR][COLOR=#098658]3[/COLOR][COLOR=#000000]; [/COLOR][COLOR=#0000ff]break[/COLOR][COLOR=#000000];[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]default[/COLOR][COLOR=#000000] : __builtin_unreachable();[/COLOR]
[COLOR=#000000]    }[/COLOR]
[COLOR=#000000]    [/COLOR][COLOR=#0000ff]return[/COLOR][COLOR=#000000] x;[/COLOR]
[COLOR=#000000]}[/COLOR]

(I made the typedef so that the code will work as C or C++.) With the first version "test1", you get a "-Wmaybe-uninitialized" warning and someone reading the code would agree - it is not clear that the code is save. The generated assembly is:

Code:

[COLOR=#008080]test1(E):[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]lea[/COLOR][COLOR=#000000]     [/COLOR][COLOR=#4864aa]edx[/COLOR][COLOR=#000000], [[/COLOR][COLOR=#4864aa]rdi[/COLOR][COLOR=#000000]+[/COLOR][COLOR=#098658]1[/COLOR][COLOR=#000000]][/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]cmp[/COLOR][COLOR=#000000]     [/COLOR][COLOR=#4864aa]edi[/COLOR][COLOR=#000000], [/COLOR][COLOR=#098658]3[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]mov[/COLOR][COLOR=#000000]     [/COLOR][COLOR=#4864aa]eax[/COLOR][COLOR=#000000], [/COLOR][COLOR=#098658]0[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]cmovb[/COLOR][COLOR=#000000]   [/COLOR][COLOR=#4864aa]eax[/COLOR][COLOR=#000000], [/COLOR][COLOR=#4864aa]edx[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]ret[/COLOR]

With "test2", there is no warning from the compiler, a programmer reading the code can see that the code assumes other values of "e" cannot occur - it is an explicit and documented assumption, rather than an implicit and hidden one. (It might still be an invalid assumption, of course - that's another matter.) And the generated assembly is:

Code:

[COLOR=#008080]test2(E):[/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]lea[/COLOR][COLOR=#000000]     [/COLOR][COLOR=#4864aa]eax[/COLOR][COLOR=#000000], [[/COLOR][COLOR=#4864aa]rdi[/COLOR][COLOR=#000000]+[/COLOR][COLOR=#098658]1[/COLOR][COLOR=#000000]][/COLOR]
[COLOR=#000000]        [/COLOR][COLOR=#0000ff]ret[/COLOR]

So the compiler takes advantage of the extra information to generate more efficient code.

If you change the conditional to add an extra enumerator to the type, you get warnings (which you want) on both functions.

Yes, I think we're basically in agreement. Your earlier post seemed to take a more absolutist stance, which I clearly reject.

Fair point. It's often easy to write posts that are too generalised and absolutist - the nuances take time and space.

**DavidBrown** · 26 February 2021, 08:09 AM

Originally posted by coder View Post

Point taken, but Boost's optional<> was first released in v1.30, 18 years ago!

https://www.boost.org/users/history/version_1_30_0.html

Some bugs take a long time to fix...

**coder** · 26 February 2021, 08:31 AM

Originally posted by DavidBrown View Post

Some bugs take a long time to fix...

Is it a bug? We've established that -Wmaybe-uninitialized cannot be correct 100% of the time. While I assume gcc's developers would like to improve its static analysis, I'm skeptical they would regard all of the false-positives involving boost:: optional as actual bugs.

**DavidBrown** · 26 February 2021, 10:51 AM

Originally posted by coder View Post

Is it a bug? We've established that -Wmaybe-uninitialized cannot be correct 100% of the time. While I assume gcc's developers would like to improve its static analysis, I'm skeptical they would regard all of the false-positives involving boost:: optional as actual bugs.

I was thinking more of it being a bug in boost, not gcc. But you are correct that it might not be viewed as a bug - more a limitation of the language. There is no good way in C or C++ to have a struct with two fields and say "this field has been given a value, but that field is uninitialised". Without a lot of effort looking at the history of the boost optional implementation alongside the changes in gcc (and that's more time that I want to spend!), it's all just guesswork.

**arQon** · 27 February 2021, 06:51 AM

Originally posted by coder View Post

It's not mutually-exclusive, is it? I guarantee there are cases where -Wuninitialized will give a false-negative! Why not wear a belt and suspenders?

Because this is not only fundamentally dishonest, it creates new problems of its own. :/

Many years ago, MSVC used to do exactly this for debug builds. (At least, effectively so: I think the actual mechanic was a side-effect of stack clears/allocations, but, meh). Which meant that bugs caused by it would be "fixed" in debug builds, so they'd never present if yo uran the code in the debugger in an attempt to find out what the hell was going on. Spectacularly unhelpful.

If you give clowns a tool like this, they WILL use it instead of fixing the code. And if that code is a library that depends on this behavior because the author lacked the skill or effort to actually fix things, now you've got a timebomb library that has to be built this way or it'll fail. It's a spiral that leads to random failures at best.

It's just a terrible idea, and the bugs that -Wuninitialized runs into doesn't change that: that's basically just a "two wrongs make a right" argument, and they very rarely actually do.

**coder** · 27 February 2021, 06:53 AM

Originally posted by DavidBrown View Post

With "test2", there is no warning from the compiler, a programmer reading the code can see that the code assumes other values of "e" cannot occur - it is an explicit and documented assumption, rather than an implicit and hidden one. (It might still be an invalid assumption, of course - that's another matter.)
...
So the compiler takes advantage of the extra information to generate more efficient code.

Thank you for walking through that.

I was not particularly familiar with __builtin_unreachable() and didn't consider the full range of its potential benefits.

In code where an enum value arrives from an external source (including the other side of a library interface), I do range-check it, to ensure that I don't need a default case. However, I was unaware of the benefits of providing a declared-unreachable default.

**coder** · 27 February 2021, 08:12 AM

Originally posted by arQon View Post

Because this is not only fundamentally dishonest, it creates new problems of its own. :/

Many years ago, MSVC used to do exactly this for debug builds. (At least, effectively so: I think the actual mechanic was a side-effect of stack clears/allocations, but, meh). Which meant that bugs caused by it would be "fixed" in debug builds, so they'd never present if yo uran the code in the debugger in an attempt to find out what the hell was going on. Spectacularly unhelpful.

That's exactly why I want the compiler to emit debug info to distinguish these automatic initializations from formal ones -- so they can be detected & distinguished at runtime!

Originally posted by arQon View Post

If you give clowns a tool like this, they WILL use it instead of fixing the code. And if that code is a library that depends on this behavior because the author lacked the skill or effort to actually fix things, now you've got a timebomb library that has to be built this way or it'll fail. It's a spiral that leads to random failures at best.

The bug exists, regardless. You're just hoping that by not zeroing it, it's more likely to cause noticeable problems. However, what would be even better than that is runtime instrumentation to ensure the problem can be detected at runtime, either through use of tools like valgrind or through the use of the compiler's own generated sanitizers.

Furthermore, if you insert a pattern instead of zeros, then it should actually be easier to trip over, in your scenario.

Originally posted by arQon View Post

It's just a terrible idea, and the bugs that -Wuninitialized runs into doesn't change that: that's basically just a "two wrongs make a right" argument, and they very rarely actually do.

As long as it doesn't replace proper testing (including code introspection, mentioned above) and debugging, the result should be safer, more deterministic code. Yet, you would prevent anyone from having that, for fear of it being abused? It's like arguing not to put airbags in cars, because a few people might use them as an excuse not to wear a seat belt. You're preventing an actual good for many, on the basis of potential harm experienced by an irresponsible few. To me, that doesn't add up.

**DavidBrown** · 27 February 2021, 08:53 AM

Originally posted by coder View Post

Thank you for walking through that.

I was not particularly familiar with __builtin_unreachable() and didn't consider the full range of its potential benefits.

In code where an enum value arrives from an external source (including the other side of a library interface), I do range-check it, to ensure that I don't need a default case. However, I was unaware of the benefits of providing a declared-unreachable default.

Yes, it is always important to check values coming in from outside! For values passed around to functions in your code, it's not always easy to judge the level of checking appropriate - do you assume that whoever is calling your function is doing so correctly and give the most efficient results, or do you assume they might make a mistake and give them the most help for debugging? (Often the answer is to have macros or conditionals of some sort to let the developer choose, or perhaps provide wrappers so that "foo" checks its arguments, then calls "foo_checked" which is the efficient implementation that assumes the arguments are valid.)

__builtin_unreachable() is a tool I find useful in code as a way of giving the compiler extra information. If you have an int parameter, which you know is between 1 and 10 and you know that might let the compiler generate better code, you can write:

Code:

if ((i < 1) || (i > 10)) __builtin_unreachable();

I use a macro wrapper like this:

Code:

extern void __attribute__((error("Assume failed"))) assumeFailed(void);
#define assume(x) \
    do { \
        if (__builtin_constant_p(x)) { \
            if (!(x)) assumeFailed(); \
        } \
        if (!x) __builtin_unreachable(); \
    } while (0)

Thus when you write "assume((i >= 1) && (i <= 10));", the compiler can use that information for optimisation and other checking. And if it can figure out that "i" is definitely not in the correct range, then you get a compile-time error. This also clearly documents the assumptions to anyone reading the code, as a part of the program rather than just a comment.

**arQon** · 27 February 2021, 08:22 PM

(note: I've been busy, so I haven't read the last few pages of comments yet).

Originally posted by coder View Post

The bug exists, regardless.

I agree all your points are fair, but I think we're looking at things differently: ISTM that you see this as a technical problem that this hack helps chip away at. But I see it as an atttiude problem, and this hack as something that only further encourages negative behavior that's already drowning our industry in garbage.
I've dealt with more than enough "But if it works, what's the problem?" over the decades for this to raise warning flags for me. Whether it's unskilled "consultant" drones or non-technical managers, a refusal to face reality and address problems only ever leads to worse systems.

> That's exactly why I want the compiler to emit debug info to distinguish these automatic initializations from formal ones -- so they can be detected & distinguished at runtime!

Runtime *after deployment* is too late. We already HAVE a mechanism to detect these cases: it's "Use -Wunitialized and don't let your code go out the door with this crap in it in the first place!". :P

(And I agree, valgrind/etc is often a good "how" for detecting these cases. But I'd much rather have it happen at compile time, and you know you do too. :P)

> Furthermore, if you insert a pattern instead of zeros, then it should actually be easier to trip over, in your scenario.

It usually is, yes: and more recent versions of VC etc have now adopted that approach rather than zeroing the stack. (I suspect the zeroing was because most of the problem cases THEY experienced were uninitialized pointers, and the idea was that nulled ones would cause an insant segfault).
But "usually" is still just "let's hope we get lucky"

, and it really comes down to the specific code. There's ALWAYS a case where random hacks can look good in dev and even in a beta on thousands of machines, but go horribly wrong in prod and/or on millions instead.

> As long as it doesn't replace proper testing

And there's the rub. It absolutely *will* do exactly that, far more often than not.

> It's like arguing not to put airbags in cars, because a few people might use them as an excuse not to wear a seat belt. You're preventing an actual good for many, on the basis of potential harm experienced by an irresponsible few. To me, that doesn't add up.

Because your strawman, though not malicious, is not how I see it. The "irresponsible few" are the vast majority. Just because you aren't one of them, and you find the whole idea of "developers" like that disgusting, you think they're a tiny minority. 10 years ago, I'd probably have just about agreed, though only because I'm an optimist at heart; and 20 years ago I definitely would have; but today I think you're being beyond optimistic. I don't see your "actual good for MANY" at all in this. I'm not even sure I see an "actual good AT ALL" that isn't already (better) covered by the tools we already have. But maybe I will once I get caught up on those last few pages.

**coder** · 28 February 2021, 04:39 AM

Originally posted by arQon View Post

I've dealt with more than enough "But if it works, what's the problem?" over the decades for this to raise warning flags for me. Whether it's unskilled "consultant" drones or non-technical managers, a refusal to face reality and address problems only ever leads to worse systems.

I know this only too well.

Originally posted by arQon View Post

Runtime *after deployment* is too late. We already HAVE a mechanism to detect these cases: it's "Use -Wunitialized and don't let your code go out the door with this crap in it in the first place!". :P

Most code gets run at least once, before deployment. And what the tools should be doing is 2-pronged:

Helping people find these problems, when they happen.
Mitigating the impact of such bugs on hapless end-users.

Towards the first point, yes, we should have compile-time warnings. However, as I've repeatedly said, -Wuninitialized simply cannot catch 100% of these errors, even in the best feasible compiler implementation. So, there will always be a need for runtime detection, whether using compiler-generated sanitizers or external tools (like valgrind).

And if you can try to get outside your developer mindset for just a moment and put yourself in the shoes of an end user, would you rather have your software compiled with these zero-initialized defaults or not? Those are the only two choices you have. I know which I'd pick. I wish all software were perfect, but I know it's not. Given that, it seems like an easy way to mitigate most uninitialized variable bugs and generally make things behave (and fail, importantly!) in a more deterministic fashion.

Originally posted by arQon View Post

> As long as it doesn't replace proper testing

And there's the rub. It absolutely *will* do exactly that, far more often than not.

I think you give bad devs too much credit. Most are too lazy and n00b to know their compiler even offers such options. As long as it's not the default, I think you're exaggerating the downside. I see distro maintainers as the main users of such an option.

Originally posted by arQon View Post

10 years ago, I'd probably have just about agreed, though only because I'm an optimist at heart; and 20 years ago I definitely would have; but today I think you're being beyond optimistic.

Industry standards will catch up. As waves of kids flood into the IT workforce, I think a lot of the bad apples will get weeded out and will see the kind of professionalization that most skilled and specialized labor undergoes.

Announcement

Proposed GCC 12 Security Option Would Auto Initialize Automatic Variables

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment