**jabl** · 02 May 2024, 04:41 AM

Originally posted by ms178:

For someone not familiar with coding or not able to read the code, how am I supposed to know if the code depends on correct IEEE754 behavior?

If the developer of the code in question doesn't endorse using -Ofast / -ffast-math (e.g. by having that option in the build configuration of the software in question), and you don't have the competence to analyze whether it's safe, then, well uh-oh, don't go and enable it?
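For anyone wondering what "depends on IEEE 754 behavior" can look like in practice, here is a minimal sketch in Python (chosen only because the thread has no code of its own). It uses Kahan compensated summation as an example of code whose correctness relies on floating-point operations being evaluated exactly as written. Python always follows IEEE 754 double arithmetic, so the "reassociated" variant is written by hand to emulate what an optimizer may be allowed to do once reassociation is permitted; the function names are made up for illustration.

```python
import math

def naive_sum(values):
    """Plain left-to-right summation."""
    total = 0.0
    for v in values:
        total += v
    return total

def kahan_sum(values):
    """Kahan compensated summation: tracks the rounding error of each add."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in values:
        y = v - comp
        t = total + y
        # Algebraically this is zero; in IEEE 754 arithmetic it is the
        # rounding error of the addition above.
        comp = (t - total) - y
        total = t
    return total

def kahan_sum_reassociated(values):
    """Hand-written emulation (an assumption, not compiler output) of the
    same loop after an optimizer simplifies (t - total) - y to 0."""
    total = 0.0
    comp = 0.0
    for v in values:
        y = v - comp
        t = total + y
        comp = 0.0  # the compensation term has been "optimized away"
        total = t
    return total

# Floating-point addition is not associative:
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))

# One large value followed by many values too small to change it one at a time.
values = [1.0] + [1e-16] * 100_000

print("naive:        ", naive_sum(values))
print("kahan:        ", kahan_sum(values))
print("reassociated: ", kahan_sum_reassociated(values))
print("math.fsum:    ", math.fsum(values))
```

The compensated loop only works because the "algebraically zero" term is computed literally. Code like this is the kind of thing that can silently lose accuracy when value-changing math optimizations are enabled, which is hard to spot without reading the source.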

**User42** · 02 May 2024, 06:43 AM

There's something missing in the article. I refuse to believe that a totally optional but convenient flag would be removed on the grounds that "people don't read the doc" (my paraphrasing).

I'm all for providing better UX, but such a flag was introduced by GCC to get close to the convenience of the Intel Compiler -xfast for compiling high-performance computing workloads (and I guess a lot of AI stuff these days). They are not equivalent, but it goes a long way for numerical workloads that don't heavily depend on IEEE compliance.

Removing it because incompetent morons, who choose their flags by ear instead of reading the doc, use it when they shouldn't merely displaces the problem while being inconvenient for all current users.

**ms178** · 02 May 2024, 07:10 AM

Originally posted by jabl:

If the developer of the code in question doesn't endorse using -Ofast / -ffast-math (e.g. by having that option in the build configuration of the software in question), and you don't have the competence to analyze whether it's safe, then, well uh-oh, don't go and enable it?

Sure, that is another simple solution and I understand the reasoning behind it. The problem is that I don't want to leave any performance on the table, and a large number of users are in the same boat: they don't know the details of the code base, nor do they know whether the devs care about -Ofast or performance at all. My current practice is to follow what Clear Linux does or to test it out experimentally on a package-by-package basis. I think this could be improved further; my dream would be to have an AI optimize the source code for the machine where the code is executed, creating optimal code for the specific target.

**aviallon** · 02 May 2024, 07:31 AM

Originally posted by rene:

the usual modern "just change for the sake of change" it has been there for decades, and software uses it in production. No reason to change to make some students hello world FP math work like in the textbook. If this lands upstream I'll just revert it for our t2 package for high performance optimizations ,-) https://t2sde.org/packages/llvm

You just have to replace -Ofast with -O3 -ffast-math, and you're done.
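As a rough illustration of that substitution, here is a small Python sketch that rewrites a flags string. The helper name `replace_ofast` is hypothetical, and the sketch assumes the Clang meaning of the flag; as pointed out further down the thread, on GCC -Ofast toggles more than -ffast-math, so the result is not guaranteed to be equivalent there.

```python
import shlex

def replace_ofast(cflags: str) -> str:
    """Return cflags with every -Ofast token replaced by -O3 -ffast-math.

    Hypothetical helper for illustration; other tokens are left untouched.
    """
    rewritten = []
    for token in shlex.split(cflags):
        if token == "-Ofast":
            rewritten.extend(["-O3", "-ffast-math"])
        else:
            rewritten.append(token)
    return shlex.join(rewritten)

print(replace_ofast("-Ofast -march=native -pipe"))
```

Splitting with `shlex` rather than doing a plain string replace avoids touching flags that merely contain the substring, and keeps quoted arguments intact.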

**aviallon** · 02 May 2024, 07:34 AM

Originally posted by ms178:

Sure, that is another simple solution and I understand the reasoning behind it. The problem is that I don't want to leave any performance on the table, and a large number of users are in the same boat: they don't know the details of the code base, nor do they know whether the devs care about -Ofast or performance at all. My current practice is to follow what Clear Linux does or to test it out experimentally on a package-by-package basis. I think this could be improved further; my dream would be to have an AI optimize the source code for the machine where the code is executed, creating optimal code for the specific target.

It already exists (genetic algorithms for automatic flag selection), but it is _super_ slow, and you must have a good synthetic benchmark with it too (as if you were doing PGO, but with a hundred more steps).
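To give a feel for the idea, here is a toy sketch of a genetic algorithm over compiler flags, in Python. It is not any existing tool's implementation. The flag list is just a handful of real GCC options picked for illustration, and `synthetic_runtime` is a made-up stand-in for the expensive step (compile with the chosen flags, run the benchmark, measure time); all the numbers in it are invented so the sketch can run on its own.

```python
import random

# A few real GCC flags, chosen arbitrarily for the example.
FLAGS = ["-funroll-loops", "-ffast-math", "-fno-plt",
         "-flto", "-march=native", "-fomit-frame-pointer"]

def synthetic_runtime(genome):
    """Stand-in for 'build with these flags and time the benchmark'.

    The gains and the interaction penalty are invented numbers, not
    measurements. A real tuner would compile and run here, which is
    why the whole process is so slow.
    """
    gains = [0.05, 0.20, 0.01, 0.10, 0.15, 0.02]
    runtime = 10.0
    for enabled, gain in zip(genome, gains):
        if enabled:
            runtime *= 1.0 - gain
    if genome[0] and genome[3]:  # pretend two flags interact badly
        runtime *= 1.08
    return runtime

def evolve(fitness, n_genes, pop_size=12, generations=30,
           mutation_rate=0.1, seed=0):
    """Minimize fitness over bit vectors (1 = flag on, 0 = flag off)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]  # keep the faster half unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randint(1, n_genes - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability (mutation).
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

best = evolve(synthetic_runtime, len(FLAGS))
chosen = [flag for flag, on in zip(FLAGS, best) if on]
print("flags:", " ".join(chosen))
print("synthetic runtime:", synthetic_runtime(best))
```

Each fitness evaluation in a real setup means a full build plus a benchmark run, and the loop above performs hundreds of them, which is where the "super slow" part comes from. The result is also only as good as the benchmark is representative.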

**ctbr** · 02 May 2024, 08:10 AM

My perception is there was some history here...

Earlier versions of SPEC CPU, such as CPU 2000, had a set of run rules, including rule 2.2.6, which limited the number of base options that could be used. This rule was later removed in the CPU 2006 run rules.
One way of addressing this limit was added to the Intel compiler: "-fast", a shorthand for a combination of optimizations, including some that met numerical guidelines from language standards at the time even if they were not fully IEEE 754 compliant.
GCC added -Ofast after -fast was already in the Intel compiler. GCC documents the set of optimizations that are performed as part of -Ofast, including a statement that these optimizations may not be fully standards compliant.
LLVM added -Ofast after it was already in GCC. LLVM's version seems to focus mostly on the -ffast-math subset of unsafe optimizations, though the Intel and GCC compilers had optimizations beyond -ffast-math.

Now LLVM has this discussion on deprecating -Ofast as an option.

**Weasel** · 02 May 2024, 11:30 AM

LLVM proving why it's a joke as always.

**indepe** · 02 May 2024, 05:49 PM

Originally posted by aviallon:

You just have to replace -Ofast with -O3 -ffast-math, and you're done.

The proponents of the change didn't seem to know (or silently ignored) that -Ofast enables/disables more options than just -ffast-math, at least on GCC.

Which makes me wonder if they are really looking at the whole situation. Personally I haven't used -Ofast _yet_; however, I'd expect this option is used by many who know what they are doing and have determined that their software really doesn't need to be super compliant. I don't think it is necessarily a good strategy to break behavior for those who know what they are doing in favor of those who don't.

Plus it sounds more like guessing that there is a problem than really knowing how large it is. The proponents "think" there is a problem related to something called "my experience" of knowledge about the option's meaning. What does that mean? Is there a real problem resulting for them, or just an unspecified number of people who don't know what the option does exactly?

**archkde** · 03 May 2024, 04:21 AM

Good, we don't need a "break my program in interesting ways" option that sounds like a good thing. If you really want non-compliant math, just use -ffast-math yourself (which should probably be called -funsafe-fast-math itself…), and you won't get shit like -fallow-store-data-races, with GCC at least.

**ms178** · 03 May 2024, 01:13 PM

Originally posted by aviallon:

It already exists (genetic algorithms for automatic flag selection), but it is _super_ slow, and you must have a good synthetic benchmark with it too (as if you were doing PGO, but with a hundred more steps).

Thanks for letting me know. While researching this topic further, I stumbled upon this arXiv paper. Hopefully the proposed AI autotuning will become a reality soon.

Proposal Raised To Deprecate "-Ofast" For The LLVM/Clang Compiler
