GCC 4.9 vs. 5.3 vs. 6.0 Compiler Benchmarks On Debian 8.4


  • #11
    Originally posted by Azpegath View Post

    I think -O3 is a bit harsh, since it is generally advised NOT to use -O3, as it can actually result in incorrect calculations, etc. -O2 is a more reasonable choice, don't you think?
    It shouldn't in theory, but in practice it is more likely to expose cracks in the compiler. -O3 is just -O2 with extra optimisations that churn CPU power and do not always deliver a better end result. With gcc-5 and later the compiler is smarter, so -O3 just means a bigger probability of breakage and a slower compile.

    But:

    1. New compilers (gcc-5, gcc-6) should be less problematic with -O3 than gcc-4 etc.

    2. flto is the main selling point of the new compilers, and flto needs aggressive optimisations to make a nice difference. FLTO brings to the table the ability to optimize across compilation units, which doesn't mean much if you cannot optimize much anyway.

    IMO there is not much point in aggressively optimizing every bit of the system. Most things can be -O2 non-flto, and then you can just do flto on selected big, complex, deep libraries and packages, where it can bring substantial benefits.
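
    For the curious, the exact delta between the two levels can be listed with "gcc -Q --help=optimizers -O2" versus the same with "-O3". Below is a minimal sketch (file and function names made up for illustration) of the kind of loop the extra -O3 passes target: in the GCC 5/6 era, auto-vectorization (-ftree-loop-vectorize) was enabled at -O3 but not at -O2.

    Code:
    #include <stddef.h>

    /* saxpy-style loop: at -O3, GCC 5/6 enables -ftree-loop-vectorize
     * and will typically emit SIMD code for this, while -O2 emits a
     * plain scalar loop.  Compare the assembly:
     *   gcc -O2 -S saxpy.c    vs.    gcc -O3 -S saxpy.c */
    void saxpy(float *restrict y, const float *restrict x,
               float a, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            y[i] += a * x[i];
    }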

    Comment


    • #12
      Originally posted by VikingGe View Post
      -O3 -march=native is usually the default for Phoronix tests, isn't it? I mean, no point in comparing the speed of unoptimized code.

      The only reason people recommend not to use -O3 is that it often breaks incorrect code, i.e. code that relies on undefined behaviour to behave in a specific way, which it may do with -O2. Can't blame the compiler for that.
      Right, if it's not -O3 -march=native, it's usually mentioned on the graphs.
      Michael Larabel
      https://www.michaellarabel.com/
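
      To illustrate "code that relies on undefined behaviour", here is a minimal made-up example (not taken from any benchmarked package): signed integer overflow is undefined in C, so the optimizer is allowed to assume it never happens; whether such code happens to "work" at one optimization level and "break" at another is luck, not a compiler bug.

      Code:
      #include <limits.h>
      #include <stdio.h>

      /* Signed overflow is undefined behaviour, so the optimizer may
       * assume that i + 1 > i always holds and compile this into an
       * infinite loop -- or not, depending on the optimization level
       * and compiler version.  The program was always broken; a higher
       * -O level merely exposes it. */
      int main(void)
      {
          for (int i = INT_MAX - 2; i > 0; i++)   /* i overflows: UB */
              printf("%d\n", i);
          return 0;
      }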

      Comment


      • #13
        Originally posted by Azpegath View Post

        I think -O3 is a bit harsh, since it is generally advised NOT to use -O3, as it can actually result in incorrect calculations, etc. -O2 is a more reasonable choice, don't you think?
        -O3 is perfectly safe to use in my experience. Free performance and efficiency.

        Comment


        • #14
          Originally posted by Brane215 View Post
          IMO there is not much point in aggressively optimizing every bit of the system. Most things can be -O2 non-flto, and then you can just do flto on selected big, complex, deep libraries and packages, where it can bring substantial benefits.
          Code is compiled once and runs millions of times on user machines.
          Phoronix benchmarks have shown that -O3 can bring a few percent of additional performance. On mobile, that may mean being able to send a crucial message before the device runs out of battery; on a server, it may mean being able to serve a few more users. There is no rational reason not to use the maximum safe optimisation level in production.

          Many practices in IT are actually driven by superstition (that once had a reason to exist), like the belief that using -O3 is dangerous. If it breaks at -O3, it's just a bug in your program. -O3 can be combined with LTO for additional gains.
          Last edited by wagaf; 14 April 2016, 10:47 PM.

          Comment


          • #15
            Originally posted by Brane215 View Post

            It shouldn't in theory, but in practice it is more likely to expose cracks in the compiler. -O3 is just -O2 with extra optimisations that churn CPU power and do not always deliver a better end result. With gcc-5 and later the compiler is smarter, so -O3 just means a bigger probability of breakage and a slower compile.

            But:

            1. New compilers (gcc-5, gcc-6) should be less problematic with -O3 than gcc-4 etc.

            2. flto is the main selling point of the new compilers, and flto needs aggressive optimisations to make a nice difference. FLTO brings to the table the ability to optimize across compilation units, which doesn't mean much if you cannot optimize much anyway.

            IMO there is not much point in aggressively optimizing every bit of the system. Most things can be -O2 non-flto, and then you can just do flto on selected big, complex, deep libraries and packages, where it can bring substantial benefits.

            Not much point? That's the dumbest comment so far. It makes such a huge difference that ~Intel created their own Distro~. And please explain to me how it's aggressive exactly? Does pushing the '3' key over the '2' key cause a kitten's head to explode somewhere? Do the terrorists win if someone pushes '3' over '2'? I'm at a loss to explain this logic.

            Comment


            • #16
              Originally posted by SaucyJack View Post

              Not much point? That's the dumbest comment so far. It makes such a huge difference that ~Intel created their own Distro~. And please explain to me how it's aggressive exactly? Does pushing the '3' key over the '2' key cause a kitten's head to explode somewhere? Do the terrorists win if someone pushes '3' over '2'? I'm at a loss to explain this logic.
              It can make some packages miscompile - they have hidden flaws which sometimes show up later, and then it usually takes many hours of chasing bugs through many already-installed packages. And the final speed gain with -O3 can still be _negative_ in the end. I have IIRC 1200+ packages on my workstation machine, and I remember having quite a few exceptions that had to be compiled with tweaked CFLAGS/CXXFLAGS/LDFLAGS.

              It clearly shows that you don't have a clue what you are talking about. If it were such a no-brainer, it would be the default option in gcc, and distros like Gentoo couldn't wait to demand it as a default. But they don't, for good reason - it simply is not worth the effort in their case. Each such breakage would mean new bugs, each bug would mean that many extra man-hours chasing it and, more importantly, such bugs tend to demand work from several teams (the team supporting package X has to cooperate with the teams responsible for libraries Y and Z, etc.).

              So in practice, with any deviation from the "standard" -O2, and especially with "-flto", you are on your own.

              I never particularly cared for flocking like sheep and doing things simply because Intel/whoever is doing something in some way. Intel is so loaded with $$$ that they could hire Shaolin monks to do parts of the compile by hand on papyrus with beautiful calligraphy, just for the fun of it. That doesn't mean we should do it, too.

              Comment


              • #17
                Originally posted by wagaf View Post
                Code is compiled once and runs millions of times on user machines.
                Phoronix benchmarks have shown that -O3 can bring a few percent of additional performance. On mobile, that may mean being able to send a crucial message before the device runs out of battery; on a server, it may mean being able to serve a few more users. There is no rational reason not to use the maximum safe optimisation level in production.
                If you are doing such sensitive applications, then you should be neck-deep in optimisation techniques at the level of kernel gurus, and you surely wouldn't need a Phoronix article to show you how to flip a CFLAG or switch compilers.

                I think userland code is on average so full of bloat that anyone seriously chasing performance should start there. People are doing stupid sh*t beyond belief. No compiler can correct that. Flipping a flag or two for the final compile can be the final polish, not a universal solution.

                Comment


                • #18
                  Originally posted by Brane215 View Post

                  It can make some packages miscompile - they have hidden flaws which sometimes show up later, and then it usually takes many hours of chasing bugs through many already-installed packages. And the final speed gain with -O3 can still be _negative_ in the end. I have IIRC 1200+ packages on my workstation machine, and I remember having quite a few exceptions that had to be compiled with tweaked CFLAGS/CXXFLAGS/LDFLAGS.

                  It clearly shows that you don't have a clue what you are talking about. If it were such a no-brainer, it would be the default option in gcc, and distros like Gentoo couldn't wait to demand it as a default. But they don't, for good reason - it simply is not worth the effort in their case. Each such breakage would mean new bugs, each bug would mean that many extra man-hours chasing it and, more importantly, such bugs tend to demand work from several teams (the team supporting package X has to cooperate with the teams responsible for libraries Y and Z, etc.).

                  So in practice, with any deviation from the "standard" -O2, and especially with "-flto", you are on your own.

                  I never particularly cared for flocking like sheep and doing things simply because Intel/whoever is doing something in some way. Intel is so loaded with $$$ that they could hire Shaolin monks to do parts of the compile by hand on papyrus with beautiful calligraphy, just for the fun of it. That doesn't mean we should do it, too.
                  Thanks for two great answers; I didn't know that GCC had measurably improved the reliability of its optimizations since GCC 4.x. I'm also running Gentoo, and my hesitation towards -O3 was just that: earlier on, the Gentoo devs recommended not running -O3 since it more often resulted in package breakage and wasn't worth the hassle in the end.

                  Comment


                  • #19
                    Originally posted by Brane215 View Post
                    If you are doing such sensitive applications, then you should be neck-deep in optimisation techniques at the level of kernel gurus, and you surely wouldn't need a Phoronix article to show you how to flip a CFLAG or switch compilers.

                    I think userland code is on average so full of bloat that anyone seriously chasing performance should start there. People are doing stupid sh*t beyond belief. No compiler can correct that. Flipping a flag or two for the final compile can be the final polish, not a universal solution.
                    Those are completely different things. Enabling -O3 is free and gives an instant speedup. That's what the discussion is about.
                    Going "neck-deep in optimisation techniques" is weeks of work for a single program. Rewriting userspace would take the whole community decades.
                    All those things can, and do, happen at the same time. I don't see the point of presenting them as if they were alternatives. Two minutes of additional battery thanks to -O3 is still two minutes of additional battery, even if ten more minutes could have been gained by spending months on optimisation.

                    I mentioned the Phoronix benchmarks because of your claim that -O3 "does not always deliver a better end result". The fact is that -O3 consistently delivers the best performance, sometimes merely on par with some other, unpredictable optimisation level.

                    Comment


                    • #20
                      Originally posted by Brane215 View Post
                      flto is the main selling point of the new compilers, and flto needs aggressive optimisations to make a nice difference. FLTO brings to the table the ability to optimize across compilation units, which doesn't mean much if you cannot optimize much anyway.

                      IMO there is not much point in aggressively optimizing every bit of the system. Most things can be -O2 non-flto, and then you can just do flto on selected big, complex, deep libraries and packages, where it can bring substantial benefits.
                      It isn't called "flto", it's called LTO: link-time optimization. The f is part of the GCC switch.

                      Originally posted by Brane215 View Post
                      With gcc-5 and later the compiler is smarter, so -O3 just means a bigger probability of breakage and a slower compile.
                      LTO runs a much higher risk of longer build times, broken builds, and compiler bugs showing up. -O3 is a near-free speedup and is tested far more than LTO. If your software breaks when you compile with -O3, it is probably your fault for not conforming to the standards; and if it is the compiler's fault, you can file a bug and the brilliant GCC devs will fix such a major issue quickly.
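
                      To make the cross-unit point concrete, here is a minimal sketch (the file names and build commands are illustrative): without -flto the compiler sees scale() only as an external call and must emit a real call from main(); with -flto the link step sees both function bodies and can inline and constant-fold across the file boundary.

                      Code:
                      /* scale.c */
                      int scale(int x) { return 4 * x; }

                      /* main.c */
                      int scale(int x);
                      int main(void) { return scale(10); }

                      /* Illustrative build:
                       *   gcc -O3 -flto -c scale.c main.c
                       *   gcc -O3 -flto scale.o main.o -o demo
                       * At link time GCC can inline scale() into main() and fold
                       * the whole program down to "return 40"; without -flto the
                       * cross-file call survives. */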

                      Comment
