Compiler Benchmarks Of GCC, LLVM-GCC, DragonEgg, Clang


• #41

Originally posted by XorEaxEax View Post
While these tests are great (kudos Phoronix!) it's unfortunate that they don't test some of the more advanced optimizations that have come in the later releases. While testing PGO (profile-guided optimization) would be a bit unfair since Clang/LLVM doesn't have this optimization...

How would that be unfair? What's the point in comparing either compiler with anything less than its strongest capabilities? If Clang/LLVM doesn't do PGO, that's their problem, nobody else's...


• #42

Originally posted by Delgarde View Post
How would that be unfair? What's the point in comparing either compiler with anything less than its strongest capabilities? If Clang/LLVM doesn't do PGO, that's their problem, nobody else's...

The issue with testing PGO is that you have to train the application, which can introduce all sorts of complications into testing. Ideally, the test framework itself would be able to script something, but that's a lot of work.


• #43

Yes, the downside with PGO is that it's not just adding another flag and away we go. The compiler needs to gather data about how the program actually runs, which makes it a two-stage process. First you compile with -fprofile-generate, which inserts a lot of information-gathering code into your program. You then run the program and try to exercise as many parts of the code as possible (not like going through every level in a game, but rather making sure the different code paths get executed). Once you exit, the program dumps all the gathered data into files, which are then used in the second (final) stage of compilation (-fprofile-use). There all the gathered data gives the compiler a plethora of information to use when judging what, when, and how to optimize.
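To make the two stages concrete, here's a minimal sketch of the GCC workflow described above (the source file myprog.c and the training input are made up for illustration):

    # Stage 1: compile with profiling instrumentation inserted
    gcc -O2 -fprofile-generate myprog.c -o myprog

    # Training run: exercise as many code paths as possible;
    # profile data (.gcda files) is written out when the program exits
    ./myprog some-representative-input

    # Stage 2: recompile, letting GCC use the gathered profile data
    gcc -O2 -fprofile-use myprog.c -o myprog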

From my experience PGO usually brings a ~10-20% performance increase on CPU-intensive code, which is a real boon, but the two-stage compilation process makes it a non-trivial optimization to use. Hence it's most often applied to projects that really need all the performance they can get: encoders, compressors, emulators, etc.



• #44

And like Smitty said, if you plan on using it routinely you should probably write a script to automate it; I know projects like Firefox and x264 do this. A rough sketch of such a script is below.
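This is only an illustration of the idea (the file names and training inputs are placeholders, not what those projects actually use):

    #!/bin/sh
    set -e

    CFLAGS="-O2"

    # Stage 1: instrumented build
    gcc $CFLAGS -fprofile-generate myprog.c -o myprog

    # Training runs on representative workloads
    ./myprog sample1.dat
    ./myprog sample2.dat

    # Stage 2: final build using the collected profiles
    gcc $CFLAGS -fprofile-use myprog.c -o myprog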



• #45

Originally posted by XorEaxEax View Post
Well, in some tests -O3 loses to -O2, but very slightly. But this is a test from a year ago and I can't even find which version of GCC was used, nor can I see if it was done on 32-bit or 64-bit. I test a lot of packages routinely (Blender, p7zip, Handbrake, Dosbox, Mame, etc.) with -O2 and -O3, and -O3 comes out on top.

Usually -O3 will lose to -O2 when there is only a megabyte or two of L2 and L3 cache. If the L2 and L3 caches are, say, 128KB, then not only will -O3 lose to -O2, but -O2 will lose to -Os.
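You can see the code-size side of that effect yourself by building the same hot code at each level and comparing the text (code) segment; hot_loop.c here is just a placeholder:

    # Compare generated code size across optimization levels;
    # a bigger .text section means more instruction-cache pressure
    for opt in -Os -O2 -O3; do
        echo "== $opt =="
        gcc $opt -c hot_loop.c -o hot_loop.o
        size hot_loop.o
    done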



• #46

Originally posted by Yezu View Post
I agree with that, also some other proprietary compilers might be compared (IBM, HP, CodeWarrior).

Also what about some ARM compiler benchmarks?

I agree, that would be very interesting. The problem, though, is that they target different architectures.

I have no experience with CodeWarrior, but they seem to target embedded platforms.

I would add PathScale to the list; they have x86 compilers, and they used to be our favourite in the past with AMD systems. But lately it is all Intel.

Also, the IBM compilers should be good, but they're not available for x86. So assuming you have access to a POWER or PowerPC machine, you can only compare them to GCC on the same machine.



• #47

GCC 4.6 compile times

Maybe I missed it, but I didn't see any mention of using --enable-checking=release or --disable-checking for the GCC 4.6 snapshot build.

By default, snapshots have lots of internal checks enabled, which make compile times MUCH slower. Those checks are disabled for releases. That's presumably the equivalent of Clang's --disable-assertions.

If you didn't build GCC 4.6 with checking disabled, that would definitely explain the slow compile times for 4.6.
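For reference, a sketch of configuring a snapshot build with those checks turned off (the paths and language list are just placeholders):

    # Configure a GCC snapshot with the expensive internal
    # self-checks disabled, as is done for release tarballs
    ../gcc-4.6-snapshot/configure --prefix=/opt/gcc-4.6 \
        --enable-checking=release \
        --enable-languages=c,c++
    make
    make install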



• #48

Originally posted by redi View Post
Maybe I missed it, but I didn't see any mention of using --enable-checking=release or --disable-checking for the GCC 4.6 snapshot build.

By default, snapshots have lots of internal checks enabled, which make compile times MUCH slower. Those checks are disabled for releases. That's presumably the equivalent of Clang's --disable-assertions.

If you didn't build GCC 4.6 with checking disabled, that would definitely explain the slow compile times for 4.6.

I was also wondering if that was the case; those compile times just seem too out of whack otherwise.



• #49

Originally posted by smitty3268 View Post
I was also wondering if that was the case; those compile times just seem too out of whack otherwise.

Well, it's obviously either a lot of debug code and/or a massive regression. AFAIK this snapshot was the last one before the feature freeze, so I guess anything could have been thrown in last minute.



                    • #50
                      i tested gcc 4.3, 4.4 and 4.5 (shortly before its release) for a fortran code i use. -O3 beats -O2, and there is a trend of improvement between releases.
