LLVM Clang 12 Benchmarks At Varying Optimization Levels, LTO
As we are used to seeing, hitting at least the -O2 optimization level delivers the bulk of the compiler's performance gains. But in cases like Crypto++, making use of "-march=native" is of measurable benefit. Link-time optimization (LTO) was also of some minor help here.
The Botan crypto benchmarks, meanwhile, failed to build with the Clang LTO option but enjoyed nice benefits from "-march=native" targeting.
MrBayes and HMMer are two of the tests showing a measurable boost with -Ofast compared to -O3, with the caveat that -Ofast enables -ffast-math, which relaxes strict IEEE floating-point semantics and can lead to potentially unsafe math.