
GCC Compiler Tests At A Variety Of Optimization Levels Using Clear Linux


  • #41
    Originally posted by indepe View Post

    Past tests here on Phoronix have shown a variety of results, but most of them show O2 and O3 either at the same speed or O3 noticeably faster. In my time on this forum I haven't read that Michael encountered any problems with -O3, whereas he did encounter problems with a lot of other things. In my short time with Linux I also encountered problems, but using O3 wasn't one of them. I still don't think that it is a problem that goes beyond some specific OS packages within Linux. I've heard some of them don't build with clang either, and I guess that's because they are doing weird things and depending on special compiler features (not surprising for operating system code). That doesn't mean there is anything wrong with using clang, either.

    One of Intel's engineers who posts here occasionally has given me the impression that O3 is a meaningful part of their optimizations. He didn't directly say so, but described that they spend some effort identifying the packages that benefit from O3.
    I fully agree with you here. Identifying, one at a time, the packages that would benefit from -O3 is probably their only option. Unless they have internally developed some method of identifying undefined behaviour, they are probably doing it by building and running dependencies and then using benchmarks and test scenarios to "feel" whether it's right. (edit: but "feeling" isn't the same thing as knowing for certain.)
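
    To illustrate why "feeling" it out is risky, here is a minimal C sketch (the file name demo.c and the safety break are just for the demo; exact results depend on the GCC version and flags). Signed overflow is undefined, so an optimizing build may fold the loop condition to "always true", while an unoptimized build simply wraps:

        #include <limits.h>
        #include <stdio.h>

        /* Signed overflow is undefined behaviour, so when optimizing
           the compiler may assume i + 1 > i always holds. An -O0
           build typically wraps and exits the loop after one step;
           an -O2/-O3 build may keep looping until the safety break. */
        int main(void)
        {
            int i = INT_MAX - 1;
            int steps = 0;
            while (i + 1 > i) {      /* UB once i == INT_MAX */
                i++;
                if (++steps > 4)
                    break;           /* safety stop for the demo */
            }
            printf("loop ran %d time(s)\n", steps);
            return 0;
        }

    Comparing the output of gcc -O0 demo.c against gcc -O3 demo.c can give different counts from the same source, which is exactly the kind of divergence a benchmark-and-"feel" approach can miss.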

    Originally posted by indepe View Post

    The compiler itself is behaving in an "undefined" way?
    I suppose that when GCC is itself compiled with -O3, then yeah, that is possible. That is, by my guess, the most likely reason cj.wijtmans was experiencing compile failures.
    Last edited by duby229; 28 March 2017, 04:41 PM.



    • #42
      Originally posted by duby229 View Post

      I fully agree with you here. Identifying, one at a time, the packages that would benefit from -O3 is probably their only option. Unless they have internally developed some method of identifying undefined behaviour, they are probably doing it by building and running dependencies and then using benchmarks and test scenarios to "feel" whether it's right. (edit: but "feeling" isn't the same thing as knowing for certain.)
      I believe Clear Linux is actually setting CFLAGS and CXXFLAGS to include -O3 as default for the user. (I'll know soon.)

      Originally posted by duby229 View Post
      I suppose that when GCC is itself compiled with -O3, then yeah, that is possible. That is, by my guess, the most likely reason cj.wijtmans was experiencing compile failures.
      So not in your case, then. Was it a long time ago, such that you don't remember exactly what happened? Which version did you use?



      • #43
        Originally posted by indepe View Post

        Past tests here on Phoronix have shown a variety of results, but most of them show O2 and O3 either at the same speed or O3 noticeably faster. In my time on this forum I haven't read that Michael encountered any problems with -O3, whereas he did encounter problems with a lot of other things. In my short time with Linux I also encountered problems, but using O3 wasn't one of them. I still don't think that it is a problem that goes beyond some specific OS packages within Linux. I've heard some of them don't build with clang either, and I guess that's because they are doing weird things and depending on special compiler features (not surprising for operating system code). That doesn't mean there is anything wrong with using clang, either.

        One of Intel's engineers who posts here occasionally has given me the impression that O3 is a meaningful part of their optimizations. He didn't directly say so, but described that they spend some effort identifying the packages that benefit from O3.
        Michael is using Ubuntu, not Gentoo, where the system is compiled with -O3. Also, Clear Linux has three profiles for compiling with different flags.



        • #44
          Originally posted by cj.wijtmans View Post

          Michael is using Ubuntu, not Gentoo, where the system is compiled with -O3. Also, Clear Linux has three profiles for compiling with different flags.
          I have already said that Clear Linux does not compile all system packages with -O3.

          Otherwise, there is no point in discussing this with you if, for example, you are not willing to say what compiler error or other "doesn't compile" problem you are getting.



          • #45
            Originally posted by rob11311 View Post
            Unfortunately that's not really been the case in practice. It can actually be really hard to write C code which has no potential for undefined behaviour, and it can be very hard to spot erroneous code (hint: accessing an array with an unsigned rather than an int variable avoids one such case). An older article relating to the kernel, from when the sanitiser first came out, discusses the common kinds of issues: https://lwn.net/Articles/575563/
            This is very true, and it's why you should always avoid invoking undefined behaviour when coding in C. The compiler is free to do whatever it wants with such cases, and it's quite typical to see behaviour vary between optimisation levels, compiler versions, and especially compiler vendors. The fault lies very much with the programmer, not the compiler.
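
            For what it's worth, the sanitiser the LWN article above describes is easy to apply to userspace code these days. A minimal sketch (the file name demo.c is illustrative), built with gcc -fsanitize=undefined demo.c using GCC 4.9+ or Clang:

                #include <stdio.h>

                /* An oversized shift is undefined behaviour in C. Built with
                   -fsanitize=undefined, the runtime prints a diagnostic naming
                   the exact line, instead of silently producing whatever the
                   optimizer decided. */
                static int shift_left(int value, unsigned amount)
                {
                    return value << amount;   /* UB if amount >= bit width of int */
                }

                int main(void)
                {
                    printf("%d\n", shift_left(1, 35));
                    return 0;
                }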
            Things are getting better with improved compiler warnings, but there's a huge amount of legacy code that's not particularly well written and has no test suite, and the reward/risk ratio for changing it means cargo-cult advice like "use -O2" has been around since forever. Then just look at the benchmarks: theoretically superior options like -O3 and -march=native tend not to deliver significant benefits. The same effect meant that programs built for the AMD64 ABI with 32-bit pointers (x32) were often slower than plain 32-bit or 64-bit builds, despite those wasting registers or memory respectively.
            This is just untrue. The x32 ABI either makes no difference to performance or greatly enhances it, and it almost invariably improves binary size and resident memory usage. The obvious limitation of the x32 ABI is the inability to efficiently handle large data sets or many mmapped files, due to the smaller address space, which is what 64-bit is for in the first place!
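
            To make the pointer-size distinction concrete, here is a small compile-time check, a sketch assuming a toolchain with x32 support (built with e.g. gcc -mx32 demo.c). GCC and Clang define both __x86_64__ and __ILP32__ for x32 builds:

                #include <stdio.h>

                int main(void)
                {
                /* x32 = the 64-bit instruction set with 32-bit pointers */
                #if defined(__x86_64__) && defined(__ILP32__)
                    puts("x32 ABI: 64-bit registers, 32-bit pointers");
                #elif defined(__x86_64__)
                    puts("plain 64-bit ABI");
                #endif
                    printf("sizeof(void *) = %zu bytes\n", sizeof(void *));
                    return 0;
                }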

            Edit: Also meant to say: -march=native *can* and sometimes does make a huge difference, but many projects explicitly build code with run-time CPU detection, which can actually interfere with optimal compile-time code generation. You also get compiler bugs and corner cases where non-optimal code is generated; there's always the chance the generic code is faster on your CPU.
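
            The run-time detection pattern mentioned above usually looks something like this sketch (the work_* function names are made up; __builtin_cpu_supports is a real GCC 4.8+ builtin). The binary stays portable, but the compiler cannot inline or specialise across the dispatch point, which is part of why -march=native does not always win:

                #include <stdio.h>

                /* Run-time CPU dispatch: the same binary runs anywhere,
                   choosing a code path after checking the host CPU. */
                static void work_avx2(void)   { puts("AVX2 code path"); }
                static void work_scalar(void) { puts("generic code path"); }

                int main(void)
                {
                    if (__builtin_cpu_supports("avx2"))
                        work_avx2();
                    else
                        work_scalar();
                    return 0;
                }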
            Whilst the real world is annoying and I have shared your frustration, I have seen almost NO progress in 25 years on this, so I expect to be stuck with this one.
            The situation is improving, though: partly through programmers using languages "better" defined than C, with fewer or no undefined behaviours, and, as mentioned, through better code analysis and warnings from compilers.
            Last edited by s_j_newbury; 30 March 2017, 04:00 AM.



            • #46
              Originally posted by atomsymbol
              That's what happens if you try to emerge for x32 on a non-x32 profile (stage3/existing system), since the toolchain isn't set up properly for x32 on Gentoo amd64. I think it would make a lot of sense if the amd64 profile always supported x32 at the toolchain level, like other distributions do, but it's not my call.

              I've added a comment to the bug accordingly.
