Following the recent comparison of the GCC, Open64, and LLVM/Clang compilers on AMD Bulldozer, this article looks at the performance of AMD's Open64 compiler when building software with the compiler tuning options that AMD recommends for Bulldozer.
This article tests the various compiler options that AMD recommends for CFLAGS/CXXFLAGS in its "Compiler Options Quick Reference Guide" for AMD Opteron Interlagos (Bulldozer) CPUs. AMD lays out the various optimizations in a very concise guide (this is a great example for those who have requested specific compiler tests in the past, where I have said to assemble a page on a Wiki or other documentation that details each recommended option for a particular environment).
This article looks only at AMD's recommended compiler options for the Open64 compiler.
The options tested were stock (CFLAGS/CXXFLAGS not overridden, so Open64 defaults to the -O2 optimization level), no optimizations (-O0), O1 local optimizations (-O1), O3 aggressive optimizations (-O3), bdver1 (Bulldozer optimizations for the march/mtune switches; -march=bdver1 -mtune=bdver1), auto-parallelization (-apo), huge pages (-HP), loop nest optimizations (-LNO:prefetch -LNO:prefetch_ahead), multi-core scalability (-mso), and LNO prefetch. Each of the C/C++ tests was re-installed before testing and built with the respective CFLAGS/CXXFLAGS set each time (using the force-install command from the Phoronix Test Suite). This article does not look at the performance when pairing various compiler options together.
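As a rough illustration of that rebuild procedure, here is a minimal Python sketch that maps each configuration to its flag string and prepares the environment for a Phoronix Test Suite force-install. The flag strings are copied from the list above. The dictionary keys, the helper names (`build_env`, `reinstall`), and the test profile name are hypothetical, and this is not the script actually used for the article. With `dry_run=True` nothing is executed, so the sketch can be inspected without the Phoronix Test Suite installed.

```python
import os
import subprocess

# Flag sets taken from the article's list; the key names are hypothetical labels.
# "stock" means CFLAGS/CXXFLAGS are left unset so the compiler default (-O2) applies.
FLAG_SETS = {
    "stock": None,
    "O0": "-O0",
    "O1": "-O1",
    "O3": "-O3",
    "bdver1": "-march=bdver1 -mtune=bdver1",
    "apo": "-apo",
    "HP": "-HP",
    "LNO": "-LNO:prefetch -LNO:prefetch_ahead",
    "mso": "-mso",
}


def build_env(flags, base_env=None):
    """Return a copy of the environment with CFLAGS/CXXFLAGS set or removed."""
    env = dict(os.environ if base_env is None else base_env)
    for var in ("CFLAGS", "CXXFLAGS"):
        if flags is None:
            env.pop(var, None)  # stock run: do not override anything
        else:
            env[var] = flags
    return env


def reinstall(test_profile, flags, dry_run=True):
    """Rebuild one test profile with the given flags via force-install."""
    cmd = ["phoronix-test-suite", "force-install", test_profile]
    env = build_env(flags)
    if not dry_run:
        subprocess.run(cmd, env=env, check=True)
    return cmd, env


if __name__ == "__main__":
    # "example-test" is a placeholder, not a real test profile name.
    for name, flags in FLAG_SETS.items():
        cmd, env = reinstall("example-test", flags)
        print(name, env.get("CFLAGS", "(unset)"), " ".join(cmd))
```

Each configuration is applied on its own, which matches the article's approach of not combining options; pairing flags would just mean adding another entry with a joined flag string.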
The AMD Open64 126.96.36.199 compiler was used on an Ubuntu 11.10 installation with the Linux 3.1 kernel, running on the AMD FX-8150 Eight-Core test system.
This article is just the second of several forthcoming compiler test articles on AMD's new Bulldozer platform.