LLVM Clang 3.8 Compiler Optimization Benchmarks With -Ofast
It's not hard to write new code, even new scientific code, that plays perfectly well with -Ofast/-ffast-math, and benefits from it. You just have to make sure you avoid things like, for example, subtracting two large nearly equal numbers and expecting their difference to retain a certain number of significant bits.
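A minimal sketch of the kind of refactoring meant here, using variance as the example (the function names are made up for illustration). The textbook one-pass formula subtracts two large, nearly equal quantities; the two-pass version subtracts the mean first, so the values being squared are small.

```c
#include <assert.h>
#include <stddef.h>

/* One-pass textbook formula: E[x^2] - (E[x])^2.
 * For data with a large mean and small spread, the two terms are huge and
 * nearly equal, so their difference keeps few (if any) correct bits. */
double variance_naive(const double *x, size_t n) {
    assert(n > 0);
    double sum = 0.0, sum_sq = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += x[i];
        sum_sq += x[i] * x[i];
    }
    double mean = sum / (double)n;
    return sum_sq / (double)n - mean * mean;
}

/* Two-pass formula: compute the mean, then average the squared deviations.
 * The subtraction happens on the raw data, before squaring, so no large
 * nearly equal intermediates are ever subtracted. */
double variance_two_pass(const double *x, size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += x[i];
    }
    double mean = sum / (double)n;

    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = x[i] - mean;
        acc += d * d;
    }
    return acc / (double)n; /* population variance */
}
```

For the data set {1000000004, 1000000007, 1000000013, 1000000016} the population variance is 22.5. The two-pass version should reproduce that, while the naive version can be badly off in double precision because the sums of squares are around 4e18, far beyond the range where doubles resolve individual integers.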
Yes, lots of legacy code might not play very nice with it. However, not playing nice with -ffast-math is a symptom of a deeper problem. If code doesn't give the same answers to within the machine noise when compiled with and without -ffast-math, then it shouldn't be expected to give the same results on, say, x86 and ARM, or a CPU and GPU. So the math in the code should probably be refactored anyway.
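One way to test "the same answers to within the machine noise" is to compare the outputs of two builds (with and without -ffast-math, or x86 vs ARM) using a relative tolerance instead of exact equality. The sketch below is only an illustration: `nearly_equal` and `DEFAULT_REL_TOL` are hypothetical names, and the right tolerance depends on how many operations feed into the value being compared.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical default: a few units of machine epsilon. Real code usually
 * needs a looser bound that reflects the length of the computation. */
#define DEFAULT_REL_TOL (16.0 * DBL_EPSILON)

/* Returns 1 if a and b agree to within rel_tol relative to the larger
 * magnitude, or to within abs_tol absolutely (useful near zero).
 * Returns 0 if either value is NaN, since every comparison then fails. */
int nearly_equal(double a, double b, double rel_tol, double abs_tol) {
    assert(rel_tol >= 0.0 && abs_tol >= 0.0);
    double diff  = a > b ? a - b : b - a;
    double mag_a = a < 0.0 ? -a : a;
    double mag_b = b < 0.0 ? -b : b;
    double scale = mag_a > mag_b ? mag_a : mag_b;
    return diff <= abs_tol || diff <= rel_tol * scale;
}
```

If the results from two builds pass a check like this with a tolerance you can justify, the code is probably in good shape for -Ofast. If they only agree bit-for-bit under one specific compiler setting, that points to the deeper problem described above.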
I've generally seen that code that gives radically different answers when run with reduced precision in a particular intermediate value can also give a radically different answer when run with INCREASED precision. So the original "expected" answer, when rigidly following the IEEE floating point standard, had little to do with the mathematically correct answer in the first place.
The question to answer, rather than just declaring that -Ofast is never appropriate for a given field of study, is whether a particular codebase is too big and complicated to be made safe when using these optimizations. The benefits are real, and the accumulated time, power consumption, &c. saved by using them may well offset the developer effort required.
-
Originally posted by caligula:
It's great for games and demos (with decent QA), but not for scientific apps.
-
Interesting. I wonder how much better the performance would be if you generate and then use a profile?
-fprofile-generate -fprofile-use http://clang.llvm.org/docs/UsersManu...d-optimization
Last edited by unquaid; 08 February 2016, 05:10 PM.
-
Originally posted by caligula:
It's great for games and demos (with decent QA), but not for scientific apps.
I absolutely hate finding that the tool that is supposed to do the work is broken. Usually costs way more hours than expected because you always assume that the code in question was broken in the first place.
-
Originally posted by milkylainen:
I don't agree that -Ofast is a valid optimization target. It's more of a play/testing target. Never unless you're dead sure about your code.
It's like saying "how fast can we make things if we break compliance", especially with floating point. fast-math can introduce some serious accuracy issues and a whole other bunch of problems. Don't know about the other optimizations.. but I assume some of them can be unsafe, otherwise they'd probably be enabled already by -O3.
The same obviously goes for GCC.
-
I had to look up -Oz as it's the first time I've heard of it:
-O3 : throw everything and hope it sticks
-O2 : optimized build, but should not explode in code size nor consume all resources while compiling
-O1 : optimized debug binaries, don't change the execution order but remove dead code and stuff
-O0 : don't touch it
-Os : optimize, but don't run passes that could blow up code. Try to be a bit more drastic when removing code. When in doubt, prefer small, not fast code.
-Oz : only perform optimizations that reduce code size. Don't even try to run things that could potentially increase code size.
(From Renato Golin, Linaro)
It would be great if you could run all the tests with -march=native next time.
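If you want to confirm which of these modes a binary was actually built with, GCC and Clang predefine a few macros that can be checked at compile time. A small sketch (the function names are made up; the macros `__OPTIMIZE__`, `__OPTIMIZE_SIZE__` and `__FAST_MATH__` are the compiler-provided ones, and as far as I know they can't tell -O1 from -O3 or -Os from -Oz):

```c
#include <assert.h>
#include <stdio.h>

/* Size-optimized builds define both __OPTIMIZE__ and __OPTIMIZE_SIZE__,
 * so the size macro has to be tested first. */
const char *optimization_mode(void) {
#if defined(__OPTIMIZE_SIZE__)
    return "size (-Os or -Oz)";
#elif defined(__OPTIMIZE__)
    return "speed (-O1 or higher)";
#else
    return "none (-O0)";
#endif
}

/* __FAST_MATH__ is defined when -ffast-math is in effect, which is also
 * the case for -Ofast. */
int fast_math_enabled(void) {
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}

/* Writes a one-line summary of the build settings to the given stream. */
void print_build_info(FILE *out) {
    assert(out != NULL);
    fprintf(out, "optimization: %s, fast-math: %s\n",
            optimization_mode(),
            fast_math_enabled() ? "yes" : "no");
}
```

Calling `print_build_info(stderr)` at startup makes it harder to mix up benchmark binaries built with different flags.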
-
I don't agree that -Ofast is a valid optimization target. It's more of a play/testing target. Never unless you're dead sure about your code.
It's like saying "how fast can we make things if we break compliance", especially with floating point. fast-math can introduce some serious accuracy issues and a whole other bunch of problems. Don't know about the other optimizations.. but I assume some of them can be unsafe, otherwise they'd probably be enabled already by -O3.
The same obviously goes for GCC.
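One concrete example of the "other problems": -ffast-math lets the compiler assume that NaNs and infinities never occur, so a check like `x != x` or `isnan(x)` may be optimized away. A sketch of a check that looks at the raw bits instead, assuming `double` is the usual 64-bit IEEE 754 binary64 format (the helper names are made up):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Builds a double from its raw 64-bit pattern. */
double double_from_bits(uint64_t bits) {
    assert(sizeof(double) == sizeof(uint64_t));
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* NaN test on the bit pattern: exponent field all ones and a non-zero
 * mantissa. This is integer logic, so the compiler's "no NaNs" assumption
 * for floating-point operations does not apply to it. */
int is_nan_bits(double x) {
    assert(sizeof(double) == sizeof(uint64_t));
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    uint64_t exponent = (bits >> 52) & 0x7FF;
    uint64_t mantissa = bits & ((UINT64_C(1) << 52) - 1);
    return exponent == 0x7FF && mantissa != 0;
}
```

This only keeps the check itself alive. Arithmetic that produces or consumes NaNs can still behave differently under -Ofast, so input validation like this belongs at the boundary, before the fast-math code runs.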
-
Phoronix: LLVM Clang 3.8 Compiler Optimization Benchmarks With -Ofast
A few days ago I posted a number of LLVM Clang optimization level benchmarks using the latest code for the upcoming Clang 3.8 release. Those tests went from -O0 to -O3 -march=native, but many Phoronix readers wanted -Ofast so here are those results too...
http://www.phoronix.com/scan.php?pag...lang-3.8-Ofast