Also, I remember reading an article about how big caches and clever precaching on modern CPUs mean that -O3 is now better than -Os. I think it was a report by Intel, but I can't find it.
"native" should be the same as "core 2" for my hardware. I might include LTO last for GCC, along with the ICC-specific settings suggested above for ICC. First I am going to run through the rest of the compilers, though... it takes a long time.
I think we are approaching saturation of this dataset, which will lead to a final result being announced (and I can finally start updating my Arch install).
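For anyone who wants to check what "native" resolves to on their own machine, here is a minimal sketch. It assumes a GCC that supports `-Q --help=target` and that the output contains a line of the form `-march=  <value>`; the function names `parse_march` and `resolved_march` are made up for illustration.

```python
import re
import subprocess


def parse_march(help_target_output):
    """Return the value printed on the '-march=' line, or None if absent."""
    for line in help_target_output.splitlines():
        # Assumed line shape: optional indent, '-march=', whitespace, value.
        match = re.match(r"\s*-march=\s+(\S+)", line)
        if match:
            return match.group(1)
    return None


def resolved_march(compiler="gcc"):
    """Ask the compiler which -march value '-march=native' expands to."""
    result = subprocess.run(
        [compiler, "-march=native", "-Q", "--help=target"],
        capture_output=True, text=True, check=True,
    )
    return parse_march(result.stdout)
```

Calling `resolved_march()` on the test machine should show whether "native" and "core 2" really select the same target; other compilers may not accept these options at all.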
* Upon request:
- Include -Os flags: works for ICC, GCC, Clang and Open64
- Include special settings: currently specific requests for ICC and GCC (LTO).
- Any other requests before the 64-bit tests are concluded?
* Stuff I am interested in:
- Check whether PCC can be tweaked for performance. I had an e-mail conversation with the current maintainer (Anders Magnusson) about possible flags.
- Is anyone an expert on TCC?
@Michael: Feel free to use these results for a Phoronix article if you want.
I think we are close to reaching saturation with this test set, which means that the only expansions that can be made are 1) different hardware, 2) more compilers.
When this series is concluded, I will start playing with 32-bit, where a number of other compilers are available in addition to those tested here (LCC, ACK, KenCC, SolarisStudio, OpenWatcom...)
Interesting and somewhat weird results here and there. I would say that it's important to use the same flags across compilers as much as possible, else the results will quickly become meaningless UNLESS you hand-tune the best settings for each compiler, which is pretty difficult. Yes, sometimes -O2 generates faster code than -O3, but -O3 is supposed to generate the fastest code, so having all compilers use that (or whatever passes for -O3 with them) would make the most sense imo. Also, it's a good thing to explicitly specify other things like -ffast-math since, as ssam mentioned, some compilers default to that, which can make a big difference in many benchmarks.
Also, I think it would be best to either stick to -march=native or specify the exact system used with -march=<system>; do not bother with -mtune.
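To illustrate the "same flags everywhere" idea, here is a rough sketch that generates identical command lines for each compiler. It only covers GCC and Clang, which both accept these three flags; the source file `bench.c`, the output naming scheme and the helper names are assumptions for the example, and other compilers would need their own equivalent spellings.

```python
# Flag set shared by every compiler in the comparison (taken from the
# suggestions above: -O3, -march=native, explicit -ffast-math).
COMMON_FLAGS = ["-O3", "-march=native", "-ffast-math"]

# Only compilers known to accept the flags above unchanged.
COMPILERS = ["gcc", "clang"]


def build_command(compiler, source, flags=COMMON_FLAGS):
    """Return the argv list that builds `source` with the shared flags."""
    stem = source.rsplit(".", 1)[0]
    output = f"{stem}-{compiler}"  # hypothetical naming: bench-gcc, bench-clang
    return [compiler, *flags, source, "-o", output]


def all_commands(source, compilers=COMPILERS):
    """One build command per compiler, all using identical flags."""
    return [build_command(compiler, source) for compiler in compilers]


if __name__ == "__main__":
    for argv in all_commands("bench.c"):
        print(" ".join(argv))
```

Keeping the flags in one list makes it obvious when a compiler gets special treatment, which is exactly the case where results stop being comparable.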
Interesting to see clang compiling p7zip; last time I checked it failed. Time to build a new version from svn.
I'd be kind of interested to see how DMC stacks up these days. (Does it build on Linux? I know DMD does, but....)
I checked the webpage for a Linux version of DMC but did not find one. Perhaps it can be compiled with winelibs, but I did not find anyone trying.
@ the rest suggesting new flags: Thanks. I will take those points into account during the -Os rounds and later for 32-bit tests. If they make a big impact I might repeat some other tests with the new flags. I think, however, that really compiler-specific tweaks are a bit out of the scope of a very broad investigation like this and would probably be more interesting when specifically comparing two compilers, like ICC vs GCC for example.
I know it's a bit late now, but my suggestion would be to ditch C-Ray as it's just a straight line across all compilers anyway. POV-Ray would have given more interesting results probably.
You are probably correct. I am just running the "compiler" suite at the moment for these tests. It has basically grown organically from my first announcement and swollen larger than I ever could have imagined.
All suggestions are welcome
I am currently looking forward to concluding this round of tests though so that I can start with 32-bit.
My previous TODO lists the things I have planned to check before concluding. If anyone has more suggestions, you had better come up with them before I have run everything on the current TODO and announce the test to be concluded/saturated (because after that I am updating my OS and subsequent analyses will not be comparable).