GCC 6.1 Compiler Optimization Level Benchmarks: -O0 To -Ofast + FLTO
-
Originally posted by float View Post
No reason not to, that is, for well made programs.
-
Originally posted by carewolf View Post
Yes there is. -Ofast enables illegal optimizations. You should not use it unless you are certain the program doesn't rely on correct FP handling. Enable it on a JavaScript engine, and it stops working.
-
Originally posted by float View Post
Are some optimisations legal and illegal now? First time I hear such a wild claim.
Originally posted by float View Post
Define correct FP handling. As far as I am concerned, -Ofast does not enable any optimisations of FP that do not conform to the C standard.
-
Originally posted by atomsymbol
If -fstrict-aliasing is enabled and the C compiler prints a warning message and the message is ignored by the user, then the generated code may be invalid.
Omitting -fno-strict-aliasing results in miscompiled LLVM+Mesa when -flto is used.
Originally posted by chithanh View Post
For example, if you dereference a pointer and then check whether it was null, it is legal to optimize away the check (because dereferencing null leads to undefined behavior).
On the other hand, if you first check for non-null pointer and only then dereference it, a conforming compiler may not optimize away this check, unless it can establish some other way that the pointer is non-null.
-Ofast implies -ffast-math which can lead to unexpected loss in precision, rounding going wrong, etc.
Originally posted by atomsymbol
In my opinion, a conforming compiler can do any transformation that preserves the semantics of the original code, including in some cases applying the optimization you just mentioned.
-
Originally posted by float View Post
So, by saying legal, do you mean optimisations that are allowed by the standard?

Originally posted by float View Post
In that case, what makes you think that the optimisations enabled by -ffast-math are not allowed by the C standard?
-
Originally posted by chithanh View Post
Yes. In the example mentioned above, the compiler optimizes away a conditional branch because it has determined at compile time that in conforming code, the comparison is always false.
The C standard references IEEE 754 for floating point math. -ffast-math changes behaviour in a way that is not allowed by IEEE 754.
https://gcc.gnu.org/wiki/FloatingPointMath
Annex F of the C standard says: "An implementation that defines __STDC_IEC_559__ shall conform to the specifications in this annex. 356)" and "356) Implementations that do not define __STDC_IEC_559__ are not required to conform to these specifications." To my knowledge GCC does not define that macro (also see https://gcc.gnu.org/c99status.html: "GCC does not define __STDC_IEC_559__ or implement the associated standard pragmas").
-
Looks like the short answer wasn't sufficient then.
Even if one assumes that "well made programs" don't rely on IEEE 754 behaviour in the absence of __STDC_IEC_559__, you can have it straight from the horse's mouth:
Originally posted by man gcc
-Ofast
Disregard strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard-compliant programs. It turns on -ffast-math and the Fortran-specific -fno-protect-parens and -fstack-arrays.
-
Originally posted by Fry-kun View Post
How come there's no LTO for GraphicsMagick?
And ImageMagick's LTO looks like a bad regression...
1) Every object file contains both the LTO intermediate code and final assembly (so that non-plugin-aware binutils can grok them). This doubles compile time, as useless binaries are produced.
2) When static libraries are used, the LTO intermediate code is silently ignored, nullifying any LTO benefits.
3) Resolution info is not available to the compiler. This forces the compiler to assume that every single symbol can be touched by the non-LTO world and thus serve as an optimization boundary. This prevents a lot of useful code transformations.
Another possible explanation is that Michael uses a parallel build but the LTO link step runs serially. In that case, in addition to the -j N passed to make, you also want to use -flto=n. The ImageMagick test is a compile-time benchmark, and the regression probably comes from one of these reasons. In general LTO builds are slower, but not by a big margin (the difference is similar to that between -O1 and -O2). You also need more memory and disk space.
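A sketch of such a build invocation (a build-command fragment; the job counts and flags are illustrative, not taken from the article's build):

```shell
# Parallelize both the compile phase and the LTO link-time phase:
# -j 8 parallelizes make, -flto=8 parallelizes the link-time optimizer.
make -j 8 CFLAGS="-O2 -flto=8" LDFLAGS="-flto=8"

# Alternatively, -flto=jobserver lets the LTO link phase share make's
# job tokens instead of hard-coding a second job count.
make -j 8 CFLAGS="-O2 -flto=jobserver" LDFLAGS="-flto=jobserver"
```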