Benchmarking GCC 4.2 Through GCC 4.8 On AMD & Intel Hardware

XorEaxEax replied

24 July 2012, 07:34 PM
Originally posted by Brane215

These tests often land in the "Meh, whatever" category, at least for me.

Sadly yes for me too.

Originally posted by Brane215

For example, the whole point of using gcc-4.7x for me is -flto optimisation.

Well, -flto was in earlier versions of GCC (introduced in GCC 4.5, IIRC), so it's not something new to 4.7. Also, I personally haven't seen a huge impact from -flto, though the binaries tend to be quite a bit smaller. Profile-guided optimization, on the other hand, usually gives me a very noticeable speed increase, but it does require more work than just adding a flag.
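For anyone who hasn't tried it, the extra work for profile-guided optimization is roughly a three-step build. Here is a minimal sketch that only assembles the command lines. The helper name `pgo_steps`, the file names and the choice of -O2 are made up for illustration; the GCC flags -fprofile-generate and -fprofile-use are the real ones.

```python
# Hypothetical helper: returns the three PGO steps as argv lists.
# File names and the default -O2 level are assumptions for illustration.
def pgo_steps(source="bench.c", binary="./bench", opt="-O2"):
    """Return commands for the instrumented build, training run and final build."""
    return [
        # 1. Instrumented build: the binary records profile data when it runs.
        ["gcc", opt, "-fprofile-generate", source, "-o", binary],
        # 2. Training run on a representative workload to collect the profile.
        [binary],
        # 3. Final build: GCC uses the collected profile to guide optimization.
        ["gcc", opt, "-fprofile-use", source, "-o", binary],
    ]


if __name__ == "__main__":
    for step in pgo_steps():
        print(" ".join(step))
```

The quality of the result depends on how representative the training run in step 2 is, which is where most of the extra effort goes.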

Looking at these tests, many of them still don't list any optimization flags, meaning they could very well have been run with -O0 or -O1, which would make them totally pointless as a compiler optimization benchmark.

If we look at those few where we see the optimization flag listed (-O3), we often get a nice performance increase with the later GCCs (look at FFTE, C-Ray, POV-Ray). But the vast majority of the tests don't have any optimization flags listed, and as such we don't know if those benchmarks are of any consequence whatsoever in terms of compiler optimization.

Then of course we have x264 and VP8, codebases which rely heavily on finely optimized assembly code for the performance-critical parts. Unless you explicitly disable the assembly code when configuring, these tests are totally worthless for comparing compiler optimization. That's even sadder given that comparing how the compilers optimize this code would actually be very interesting.

Originally posted by Brane215

Also, when finding regressions, it would be nice to go in-depth into their cause. Is it an error on the part of the compiler, or did the program's infrastructure simply misunderstand some new compiler feature, for example?

Yes, but we don't even know if they are regressions if we don't know the actual optimization flag. GCC defaults to -O0, which is meant for debugging and turns off optimization, so if -O0 generates slower code between versions it's hard to call that a regression, as -O0 does nothing to improve code performance.

Now, given that -O0 is the default for GCC, unless we actually set a -On optimization level the tests will be done with no optimization, thus rendering them useless. So when Michael omits reporting any compiler optimization flags it's simply impossible to tell if the benchmarks have any value whatsoever.
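To make the point concrete, here is a small sketch of how one could work out which optimization level a given set of compiler flags actually results in. The function name `effective_opt_level` is made up for illustration. It relies on two documented GCC behaviours: with no -O option the level is -O0, and when several -O options are given the last one wins (a bare -O means -O1).

```python
# Hypothetical helper, not part of any benchmark suite.
def effective_opt_level(flags):
    """Return the -O level GCC would apply for the given list of flags."""
    level = "0"  # GCC's default when no -O option is present
    for flag in flags:
        if flag.startswith("-O"):
            # The last -O option on the command line wins; bare -O equals -O1.
            level = flag[2:] or "1"
    return level


if __name__ == "__main__":
    print(effective_opt_level(["-march=native", "-pipe"]))  # 0
    print(effective_opt_level(["-O2", "-O3"]))              # 3
```

A benchmark report that printed this value next to each result would at least show whether a test ran unoptimized.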
Brane215 replied

24 July 2012, 02:14 PM
I'd love to see more in-depth tests.

These tests often land in the "Meh, whatever" category, at least for me.

I would like to see more thought-out in-depth tests.

For example, the whole point of using gcc-4.7x for me is -flto optimisation.

It would be nice to see what it can bring to the table when compiling programs from many sources and compilation units which are then linked into a final library and/or executable.

Of course, the right program to test this is not something like tar but something more complex.

Also, it would be nice to see and compare the resources used and the final result during -flto compilation and linking. -flto has been notorious for eating memory and CPU cycles when compiling Chrome or OpenOffice. It would be nice to see how much impact this has with gcc-4.5* through gcc-4.8*.

Also, when finding regressions, it would be nice to go in-depth into their cause. Is it an error on the part of the compiler, or did the program's infrastructure simply misunderstand some new compiler feature, for example?
VinzC replied

24 July 2012, 11:38 AM
Thanks a lot for this benchmarking, Michael. As a Gentoo user I know what to do.

But something must be done about the horrible assortment of colours, e.g. whenever red is used. I am aware (from other threads) that it looks easier said than done, but the current colour assortment really impairs reading those graphs.
smitty3268 replied

24 July 2012, 12:58 AM
Originally posted by devius

But why didn't those changes also affect Intel systems the same way?

I think someone here once said that the C-Ray results were heavily linked to how aggressive the compiler was at inlining code (more inlining = faster), so it's possible that this affects each processor differently (differently sized caches, branch predictors, etc.).
Vadi replied

23 July 2012, 11:14 PM
Pretty good improvements there - big thanks to everyone who contributes to GCC.

I love it when a product keeps being developed and improved.
liam replied

23 July 2012, 08:15 PM
Originally posted by devius

But why didn't those changes also affect Intel systems the same way?

My guess would be that Intel's architecture, as far as SMP is concerned, hasn't changed in ages (since around Nehalem's QPI).
devius replied

23 July 2012, 06:38 PM
Originally posted by crazycheese

(1) In 4.7, GCC received SMP improvements; for example, Bulldozer rendering went up around 50%. There was an article on it.

But why didn't those changes also affect Intel systems the same way?
bug77 replied

23 July 2012, 03:41 PM
Originally posted by devius

Now the question is why?

Basically, because there's no free lunch. Few optimizations will improve performance across the board. Usually you win some, you lose some.
crazycheese replied

23 July 2012, 03:34 PM
Originally posted by devius

Now the question is why? I wish Phoronix had a big editorial staff...

(1) In 4.7, GCC received SMP improvements; for example, Bulldozer rendering went up around 50%. There was an article on it.
devius replied

23 July 2012, 02:04 PM
Originally posted by bug77

A couple of interesting things I see:
1. In C-Ray, Intel was faster with GCC 4.2; AMD ends up faster with GCC 4.7/4.8.
2. In FLAC audio encoding, the situation is actually reversed (though the differences are smaller this time): AMD starts out on top and ends up at the bottom.

Now the question is why? I wish Phoronix had a big editorial staff...