Linux 5.15's New "-Werror" Behavior Is Causing A Lot Of Pain


  • #91
    Originally posted by oiaohm View Post
    The bad word in the second bit is corresponding.
    No, the word "corresponding" is actually quite well defined for the purposes of this discussion, both for the "standard integer types" and for those defined in <stdint.h>, such as int64_t and uint64_t, although the rule definitions as a whole are admittedly complicated.



    In 6.2.5, referring to "standard integer types", it says this:
    "For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements."

    In 7.20, "Integer types <stdint.h>", which includes int64_t and uint64_t:
    "When typedef names differing only in the absence or presence of the initial u are defined, they shall denote corresponding signed and unsigned types as described in 6.2.5; an implementation providing one of these corresponding types shall also provide the other."

    Originally posted by oiaohm View Post
    The types you used are typedefs that are not clearly covered by the C standard's ranking, so the compiler could have shoved them into a stupid position that depends on the order they were declared in the header files. With the wrong placement, int64_t could end up with a higher rank than uint64_t. How do you declare in C that uint64_t is the corresponding type to int64_t? That's right, you don't. The C standard does not clearly define how typedefs are added to the rank system, so welcome to another bit of undefined behaviour where the compiler is allowed to do anything to you. The fact that you used uint64_t and int64_t adds to the fun, since these are technically extended integer types.
    Apparently you are not aware of chapter 7.20, which as quoted above clearly says that int64_t and uint64_t are corresponding types, since they are both defined in <stdint.h> and differ only in the initial "u".
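    As a side note, the "same storage / same alignment" requirement from 6.2.5 can be checked directly. This is just a minimal C11-flavoured sketch (the assertion messages are mine, nothing kernel-related):
    Code:
    #include <stdint.h>

    /* corresponding signed/unsigned types must use the same storage and alignment */
    _Static_assert(sizeof(int64_t) == sizeof(uint64_t), "same storage");
    _Static_assert(_Alignof(int64_t) == _Alignof(uint64_t), "same alignment");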

    Comment


    • #92
      who here is using -Werror -Wall -pedantic?

      or is it even a thing nowadays?

      Comment


      • #93
        Originally posted by indepe View Post
        Apparently you are not aware of chapter 7.20, which as quoted above clearly says that int64_t and uint64_t are corresponding types, since they are both defined in <stdint.h> and differ only in the initial "u".
        My problem is that I was still thinking in terms of C89, where those are not defined in chapter 7.20. Some compilers still on the Linux kernel's support list don't support C11, yet those types will still be defined in stdint.h by the C library. Fun: a C11 C library paired with a compiler that isn't.

        This is another problem: what is a defined corresponding type in C11 is not always a defined corresponding type in C89. Of course, people writing code that depends on corresponding types never add a check that fails if they are not on the right C version. There is a reason the C standard includes the "__STDC_VERSION__" macro.

        indepe, this is another problem: rank assignments in gcc can also change with the standard version gcc is told to target. Sorry, I forgot to mention this problem with corresponding types as well.

        Rank assignments are a mess. Corresponding type assignments depend on the C version your compiler is using (yes, this changes with gcc when you set the standard version). And we have compilers that screw up their rank system completely and do the undefined behaviour of converting unsigned to signed when you fail to cast the conversion yourself.

        The reality here is that an unsigned to signed conversion under C is undefined behaviour that the compiler has the right to simply error out on. Signed to unsigned is defined by the C standard; the reverse is not. Rank assignments are a mess, so something that looks right can run straight into undefined behaviour.

        Code:
        #include <stdint.h>
        int main(void) {
            signed int si = -1;
            uint32_t ui = ~0u;   /* all bits set */
            if (si == ui) { }    /* which operand gets converted here? */
            if (si != ui) { }
            return 0;
        }
        This code can still be an error under the standard, because signed int is not the corresponding type of uint32_t, so signed int might have a higher rank than uint32_t, resulting in a conversion in the wrong direction. Remember that for the wrong-direction conversion the C standard allows the code to crash or not build.

        It would be so much simpler if the C standard just said that conversion from unsigned to signed is completely forbidden in cases where the value cannot remain positive; that way compilers could not make the wrong conversion based on rank assignment. With uint32_t being 32 bits wide and signed int si being 32 bits wide, a conversion from uint32_t to signed int would then break the rule that the conversion may not produce a negative value. This would cure a lot of problems.
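        A rough sketch of what that proposed rule would mean in practice (the helper name and the failure behaviour are mine, purely illustrative):
        Code:
        #include <stdint.h>
        #include <limits.h>

        /* allowed only when the value stays non-negative in an int */
        static int int_from_u32(uint32_t u, int *out)
        {
            if (u > (uint32_t)INT_MAX)
                return 0;        /* 0x80000000 and above: conversion rejected */
            *out = (int)u;
            return 1;
        }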

        In a lot of ways, the way to fix C's undefined behaviour problem is to make all undefined behaviour a build error. This is perfectly conforming to the C standard if you choose to do it, because undefined behaviour is something the C standard allows you to define however you like, including as a failure to build.

        This, with the Linux kernel wanting to turn -Werror on by default, is pretty much "let's get rid of as much undefined behaviour as possible". The majority of the C standard's undefined behaviour, with gcc with all warnings on, will generate warnings that then become errors.
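        A trivial illustration of that warning-to-error promotion (the snippet itself is just an example, not kernel code):
        Code:
        /* gcc -Wall -c example.c          -> warning: unused variable 'tmp'
           gcc -Wall -Werror -c example.c  -> same diagnostic, now a hard build error */
        int f(void)
        {
            int tmp = 0;   /* never used */
            return 1;
        }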

        Comment


        • #94
          Originally posted by oiaohm View Post
          My problem is that I was still thinking in terms of C89, where those are not defined in chapter 7.20. Some compilers still on the Linux kernel's support list don't support C11, yet those types will still be defined in stdint.h by the C library. Fun: a C11 C library paired with a compiler that isn't.

          This is another problem: what is a defined corresponding type in C11 is not always a defined corresponding type in C89. Of course, people writing code that depends on corresponding types never add a check that fails if they are not on the right C version. There is a reason the C standard includes the "__STDC_VERSION__" macro.
          The new Phoronix article mentioning that the kernel raised its gcc version requirement but kept C89 finally triggered my curiosity: I looked up the C89 definition of "usual arithmetic conversions":



          3.2.1.5 Usual arithmetic conversions

          Many binary operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions:
          First, if either operand has type long double, the other operand is converted to long double.
          Otherwise, if either operand has type double, the other operand is converted to double.
          Otherwise, if either operand has type float, the other operand is converted to float.
          Otherwise, the integral promotions are performed on both operands. Then the following rules are applied:
          If either operand has type unsigned long int, the other operand is converted to unsigned long int.
          Otherwise, if one operand has type long int and the other has type unsigned int, if a long int can represent all values of an unsigned int, the operand of type unsigned int is converted to long int; if a long int cannot represent all the values of an unsigned int, both operands are converted to unsigned long int.
          Otherwise, if either operand has type long int, the other operand is converted to long int.
          Otherwise, if either operand has type unsigned int, the other operand is converted to unsigned int.
          Otherwise, both operands have type int.
          The values of operands and of the results of expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby.
          This would seem to work fine (or not fine, but defined) for comparing unsigned and signed ints of the same size for equality, except that it would need to be extended likewise for long long int on 32-bit systems. However, I see indications that GCC doesn't support long long int in pre-C90 mode, and in that case using uint64_t probably fails in the first place and a typecast doesn't help either.

          EDIT: Or maybe it does if you use "__extension__"...
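          (A minimal sketch of that idea, assuming gcc in a strict -std=c89 -pedantic mode; the typedef name is just for illustration:)
          Code:
          /* with gcc -std=c89 -pedantic, __extension__ suppresses the
             "ISO C90 does not support 'long long'" diagnostic for this declaration */
          __extension__ typedef unsigned long long u64_compat;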
          Last edited by indepe; 14 September 2021, 04:35 AM.

          Comment


          • #95
            Originally posted by indepe View Post
            This would seem to work fine (or not fine, but defined) for comparing unsigned and signed ints of the same size for equality, except that it would need to be extended likewise for long long int on 32-bit systems. However, I see indications that GCC doesn't support long long int in pre-C90 mode, and in that case using uint64_t probably fails in the first place and a typecast doesn't help either.

            EDIT: Or maybe it does if you use "__extension__"...

            You've got to love gcc's fibbing-to-heck documentation. Gcc 4.9, which the Linux kernel just moved away from, technically supports up to 128-bit integers on 32-bit systems in pre-C90 (gnu89) mode. One problem: it's insane. Pre-C90, extra types like long long are not in the C specification, so the gcc 4.9 (and earlier) developers went "woo-hoo, we can go the pure undefined-behaviour route and do whatever we like".

            The patch updating the Linux kernel's minimum gcc from 4.9 to 5.1 removes a lot of "check for gcc 4.9 and do something special" code for places where the compiler does whatever it downright likes because the construct is not in the C specification.

            It's true that the gcc documentation tells you that long long int is not supported outside C90 mode, when the horrible truth is that it is supported in gnu89 mode, with the total insanity of doing whatever it likes, including things that technically break 3.2.1.5 Usual arithmetic conversions of C89; that doesn't count, though, because long long int is undefined in C89.

            Undefined behaviour in the C standard is an ass. The extra clause in C11 making unsigned to signed conversion undefined behaviour was added as a way to cover for prior C compilers breaking the standard. And the C89 wording, as you will have noticed, has no allowance for doing an unsigned to signed conversion without a cast, and also doesn't define what happens if you cast unsigned to signed and the value is too large to fit in the positive range of the signed type.

            C89 does not have all the rank conversion crap. C11 has made the rules way too complex and flawed.

            indepe, yes, the C89 standard does not have a section covering typecasts properly. To be correct, it's the typedef stuff: neither C89 nor C11 states clearly that the rules of signed and unsigned conversion have to be enforced on typedef-declared types. Of course, since it's not stated clearly, the person implementing a C compiler is free to go off into their own ideas. So yes, undefined behaviour. Remember that C11 clearly states that a compiler converting from unsigned to signed is allowed to do whatever it likes.

            uint64_t being a typedef (sorry, I kept writing typecast), it is undefined behaviour territory in older compilers, and even C11 does not have proper rules for handling typedefs; instead you end up in areas where compilers can do whatever they like.

            We need a C standard with fewer loopholes. Yes, C compilers that turn anything undefined in the C standard into an error are annoying, I will give developers that. But in my personal opinion, compilers that make anything not in the C standard an error would do good in the long term, as it might finally put pressure on getting a C standard without loopholes, so that a conforming compiler cannot do as many stupid things.

            A lot of C problems in fact trace back to the standard documents giving compiler developers way too much wiggle room.

            C can be made a lot better than what it is.

            Comment


            • #96
              Originally posted by oiaohm View Post
              things that technically break 3.2.1.5 Usual arithmetic conversions of C89; that doesn't count, though, because long long int is undefined in C89.
              Sure, strictly "technically" in terms of the standard itself, but I think you are making it too complicated. When the compiler extension is needed (for 64 bit ints on 32 bit systems) you can expect the compiler to also extend the "definition" to include "long long int" in the same way as "long int". Especially since later compilers support C90 and then C11 at the same time, so to speak. That would be a really weird compiler extension if it supported "long long" and yet at the same time it handled the arithmetic conversion as "undefined" in order to apply behavior-changing optimizations. You could argue that the compiler might be allowed to even use multiplication instead of addition, resulting in a complete disaster when using a 64 bit type on 32 bit systems.

              And now we are just talking about the special case of 64-bit ints on 32 bit systems anyway, and you now seem to agree that the standard does clearly resolve the other corresponding cases in favor of "unsigned". Of course, as stated, I'm all for using typecasts anyway.
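              A quick way to see that in practice (a C11-flavoured sketch using _Generic; nothing here comes from the kernel, the macro name is mine):
              Code:
              #include <stdint.h>

              /* which type do the usual arithmetic conversions pick for int64_t vs uint64_t? */
              #define TYPE_NAME(x) _Generic((x),          \
                      int64_t:  "int64_t",                \
                      uint64_t: "uint64_t",                \
                      default:  "other")

              /* TYPE_NAME((int64_t)1 + (uint64_t)1) yields "uint64_t": the corresponding
                 unsigned type wins, as the standard describes */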

              Originally posted by oiaohm View Post
              indepe yes the C89 standard does not have a section covering typecast properly. To be correct its the typedef type casting stuff the C89 and C11 does not state clearly that the rules of signed and unsigned conversion has to be enforced on typedef declares of types. Of course since it not stated clearly the person implementing a C complier is free to go off into own idea. So yes undefined behaviour. Remember C11 does clearly state compiler converting from unsigned to signed is allowed todo what ever they like.
              I can't agree on this one at all, as I don't see why you would think a special rule is needed for "typedef"-ed integer types (really meaning "typedef", not "typecast"). Why would the same rules not apply as to the original type, unless otherwise stated? Anything else would be extremely weird.

              The C89 document quoted above says: "A typedef declaration does not introduce a new type, only a synonym for the type so specified."

              Although wikipedia wants to be more specific (without clarifying which C version this applies to):
              "typedef is a reserved keyword in the programming languages C and C++. It is used to create an additional name (alias) for another data type, but does not create a new type,[1] except in the obscure case of a qualified typedef of an array type where the typedef qualifiers are transferred to the array element type.[2] As such, it is often used to simplify the syntax of declaring complex data structures consisting of struct and union types, but is just as common in providing specific descriptive type names for integer data types of varying lengths."

              Originally posted by oiaohm View Post
              C can be made a lot better than what it is.
              That much I agree with.

              Comment


              • #97
                Originally posted by oiaohm View Post

                My problem is that I was still thinking in terms of C89, where those are not defined in chapter 7.20. Some compilers still on the Linux kernel's support list don't support C11, yet those types will still be defined in stdint.h by the C library. Fun: a C11 C library paired with a compiler that isn't.

                This is another problem: what is a defined corresponding type in C11 is not always a defined corresponding type in C89. Of course, people writing code that depends on corresponding types never add a check that fails if they are not on the right C version. There is a reason the C standard includes the "__STDC_VERSION__" macro.

                indepe, this is another problem: rank assignments in gcc can also change with the standard version gcc is told to target. Sorry, I forgot to mention this problem with corresponding types as well.

                Rank assignments are a mess. Corresponding type assignments depend on the C version your compiler is using (yes, this changes with gcc when you set the standard version). And we have compilers that screw up their rank system completely and do the undefined behaviour of converting unsigned to signed when you fail to cast the conversion yourself.

                The reality here is that an unsigned to signed conversion under C is undefined behaviour that the compiler has the right to simply error out on. Signed to unsigned is defined by the C standard; the reverse is not. Rank assignments are a mess, so something that looks right can run straight into undefined behaviour.

                Code:
                #include <stdint.h>
                int main(void) {
                    signed int si = -1;
                    uint32_t ui = ~0u;   /* all bits set */
                    if (si == ui) { }    /* which operand gets converted here? */
                    if (si != ui) { }
                    return 0;
                }
                This code can still be an error under the standard, because signed int is not the corresponding type of uint32_t, so signed int might have a higher rank than uint32_t, resulting in a conversion in the wrong direction. Remember that for the wrong-direction conversion the C standard allows the code to crash or not build.

                It would be so much simpler if the C standard just said that conversion from unsigned to signed is completely forbidden in cases where the value cannot remain positive; that way compilers could not make the wrong conversion based on rank assignment. With uint32_t being 32 bits wide and signed int si being 32 bits wide, a conversion from uint32_t to signed int would then break the rule that the conversion may not produce a negative value. This would cure a lot of problems.

                In a lot of ways, the way to fix C's undefined behaviour problem is to make all undefined behaviour a build error. This is perfectly conforming to the C standard if you choose to do it, because undefined behaviour is something the C standard allows you to define however you like, including as a failure to build.

                This, with the Linux kernel wanting to turn -Werror on by default, is pretty much "let's get rid of as much undefined behaviour as possible". The majority of the C standard's undefined behaviour, with gcc with all warnings on, will generate warnings that then become errors.
                After reading your post, I get the impression that the C/C++ committees always deliver languages with flaws.

                I knew integer conversion has always been a pain in both languages, but I did not know it was that complicated.

                Same for C++, except you also have to deal with user-defined conversions, automatic calls to ctors, initializer lists (the worst design), overloading, templates, and so on, as if the problem weren't bad enough.

                I just wish they could introduce more verbosity, like Rust, which forbids implicit conversions for all integers/structs/enums except for pointers.

                Comment


                • #98
                  Originally posted by NobodyXu View Post
                  I just wish they could introduce more verbosity, like Rust, which forbids implicit conversions for all integers/structs/enums except for pointers.
                  That's what -Werror and/or the "no warnings" policy are for. For me that pretty much solves it, and I often also add explicit asserts (like assert(i >= 0) or assert(u <= INT32_MAX)).

                  If you want to go a step further, you might write macros or template functions that do the type conversion and a range check in one step (for cases that are not extremely performance relevant).
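                  For example (a hedged sketch in plain C; the helper name int32_from_u32 and the assert-based failure policy are just placeholders, not an established API):
                  Code:
                  #include <assert.h>
                  #include <stdint.h>

                  /* convert uint32_t to int32_t, checking the range in the same step */
                  static inline int32_t int32_from_u32(uint32_t u)
                  {
                      assert(u <= (uint32_t)INT32_MAX);
                      return (int32_t)u;
                  }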

                  Comment


                  • #99
                    Originally posted by indepe View Post
                    Sure, strictly "technically" in terms of the standard itself, but I think you are making it too complicated. When the compiler extension is needed (for 64 bit ints on 32 bit systems) you can expect the compiler to also extend the "definition" to include "long long int" in the same way as "long int". Especially since later compilers support C90 and then C11 at the same time, so to speak. That would be a really weird compiler extension if it supported "long long" and yet at the same time it handled the arithmetic conversion as "undefined" in order to apply behavior-changing optimizations.
                    I wish it were a weird compiler extension or some oddball compiler, but we are in fact talking about gcc 4.9 and earlier in gnu89 mode, which is the horrible default. Linus Torvalds calling gcc a garbage compiler at various times is more than justified by some of the insanity of its implementation.

                    And now we are just talking about the special case of 64-bit ints on 32 bit systems anyway, and you now seem to agree that the standard does clearly resolve the other corresponding cases in favor of "unsigned". Of course, as stated, I'm all for using typecasts anyway.

                    Originally posted by indepe View Post
                    I can't agree on this one at all, as I don't see why you would think a special rule is needed for "typedef"-ed integer types (really meaning "typedef", not "typecast"). Why would the same rules not apply as to the original type, unless otherwise stated? Anything else would be extremely weird.
                    This goes back to what is in the C11 documentation, where user-defined type synonyms, i.e. typedefs, are to be of lower rank in conversions than the default C-standard-defined types. This kind of rank conversion crud appears between the C89 and C11 standards, which is why it starts appearing in gnu89, that is, the GNU compilers' extended C89.

                    Originally posted by indepe View Post
                    The C89 document quoted above says: "A typedef declaration does not introduce a new type, only a synonym for the type so specified."
                    So this is right: a pure C89 compiler is not allowed rank conversion. But a compiler that is in the mixed area between C89 and C11 is allowed rank conversion, and very badly broken rank conversion at that.


                    No two signed integer types shall have the same rank, even if they have the same representation.
                    and
                    The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
                    These rules in C11 are downright cursed. In fact they first appear in a C90 draft.
                    The rule that no two signed integer types shall have the same rank even if they have the same representation nukes that old C89 synonym rule. The synonym rule was in fact sane. I cannot remember the exact argument for this rank change, but it was something to do with CPU designs and the different mathematical processing options.

                    C89 allowed signed integer types with the same representation to be stacked with each other, and that disappears with C90 and newer. The name typedef says "type define", as in "create a new type"; that is what makes the first rule highlighted here apply, and that then kicks on to the second one, where the problem comes in. The second line could be fairly simply fixed:
                    The rank of any unsigned integer type shall equal "or be higher than" the rank of "all" corresponding signed integer types, if any.
                    That rule change would make a rank pattern where unsigned is mixed in among signed impossible. So you would have all signed types ranked with each other and all unsigned types with each other for the same representation. But this is not the C11 standard we have.

                    The same goes for a solid definition that you cannot convert a negative signed value to unsigned at all; that would also have prevented problems. Putting into the standard that for a negative signed value converted to unsigned the compiler can do whatever it likes, including not building, was really not above board for a dependable standard.

                    Do remember that C11 goes a step deeper into hell by saying that compiler-provided types have to be of higher rank than typedef ones. The complete rank system in C90 and later is cursed. The way to avoid the curse is to use casts a lot more.

                    Originally posted by indepe View Post
                    Although wikipedia wants to be more specific (without clarifying which C version this applies to):
                    "typedef is a reserved keyword in the programming languages C and C++. It is used to create an additional name (alias) for another data type, but does not create a new type,[1] except in the obscure case of a qualified typedef of an array type where the typedef qualifiers are transferred to the array element type.[2] As such, it is often used to simplify the syntax of declaring complex data structures consisting of struct and union types, but is just as common in providing specific descriptive type names for integer data types of varying lengths."
                    That is a C89-and-older description. For C90 and newer it does not match, and the cause of the mismatch is the introduction of the ranked conversion rules. The gnu89 mode of the gcc compiler is cursed because it's a halfway house between C89 and C11; C11 is its own version of cursed, and with only some of its rules present, gnu89 becomes even more cursed. Really, I wish that Wikipedia description were the only description of typedef.

                    Comment


                    • Originally posted by NobodyXu View Post
                      I just wish they could introduce more verbosity, like Rust, which forbids implicit conversions for all integers/structs/enums except for pointers.
                      The reality here is that a lot of C compilers, with all warnings on and -Werror or an equivalent enabled, do exactly that. For some of this we would not need compiler flags if the C language documentation were written better.

                      Also, we have people complaining when gcc's -Wsign-compare throws an error over (signed==unsigned). This error exists purely because of the standard flaw in C90 and newer where that comparison may not go in a sane direction. In fact it can go in a direction where the compiler, by the C90-C11 standards, has the right to refuse to build or to error at runtime.

                      There are a lot of gcc and llvm warning flags that people turn off because they think the warnings are wrong, and they complain that leaving them on makes their code less readable/more messy.
                      ((unsigned)signed==unsigned): yes, I do agree this is messier than (signed==unsigned), but the messier one is the one we are meant to use, due to the defective C standards, so that we are 100 percent sure not to go into undefined behaviour that can result in a stack of different errors.
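                      A small sketch of that pattern (nothing kernel-specific, just the cast made explicit so the direction of the conversion is spelled out):
                      Code:
                      static int same_value(int si, unsigned int ui)
                      {
                          /* "if (si == ui)" would trigger -Wsign-compare; the explicit
                             cast below spells out which way the conversion goes */
                          return (unsigned int)si == ui;
                      }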

                      Rust does have the advantage here that its language definition is not as quirky. Of course, that does not mean C cannot be fixed. -Wall -Werror is a good way of moving C code in a fixed direction, by simply making the undefined C misbehaviours unbuildable.

                      Comment
