Compiler Benchmarks Of GCC, LLVM-GCC, DragonEgg, Clang


  • XorEaxEax
    replied
    And like Smitty said, if you plan on using PGO routinely you should probably make a script to automate it; I know projects like Firefox and x264 do this.


  • XorEaxEax
    replied
    Yes, the downside with PGO is that it's not just a matter of adding another flag and away we go. The compiler needs to gather data about how the program actually runs, which makes it a two-stage process. First you compile with -fprofile-generate, which inserts a lot of information-gathering code into your program; you then run the program and try to exercise as many parts of the code as possible (not like going through every level in a game, but rather making sure the different parts of the code get executed). Once you exit the instrumented program it dumps all the gathered data into files, which are then used in the second (final) stage of compilation (-fprofile-use). There, the gathered data gives the compiler a wealth of information to use when judging what, when, and how to optimize.

    From my experience PGO usually brings a ~10-20% performance increase on CPU-intensive code, which is a real fine boon, but the two-stage compilation process makes it a non-trivial optimization to use. Hence it's most often applied to projects that really need all the performance they can get: encoders, compressors, emulators, etc.
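
    In concrete terms the two stages look roughly like this; a minimal sketch, assuming a hypothetical single-file C program (prog.c) and a representative input (training.dat), using GCC's -fprofile-generate/-fprofile-use flags as described above:

    # Stage 1: build with instrumentation that records how the program runs
    gcc -O2 -fprofile-generate prog.c -o prog

    # Training run: exercise as many code paths as possible; the profile
    # data (.gcda files) is written out when the program exits
    ./prog training.dat

    # Stage 2: rebuild, letting GCC optimize based on the recorded profile
    gcc -O2 -fprofile-use prog.c -o prog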


  • smitty3268
    replied
    Originally posted by Delgarde View Post
        Originally posted by XorEaxEax View Post
        While these tests are great (kudos Phoronix!) it's unfortunate that they don't test some of the more advanced optimizations that have come in later releases. While testing PGO (profile-guided optimization) would be a bit unfair since Clang/LLVM doesn't have this optimization...
        How would that be unfair? What's the point in comparing either compiler with anything less than its strongest capabilities? If Clang/LLVM doesn't do PGO, that's their problem, nobody else's...
    The issue with testing PGO is that you have to train the application, which can introduce all sorts of complications into testing. Ideally, the test framework itself would be able to script something, but that's a lot of work.


  • Delgarde
    replied
    Originally posted by XorEaxEax View Post
    While these tests are great (kudos Phoronix!) it's unfortunate that they don't test some of the more advanced optimizations that have come in later releases. While testing PGO (profile-guided optimization) would be a bit unfair since Clang/LLVM doesn't have this optimization...

    How would that be unfair? What's the point in comparing either compiler with anything less than its strongest capabilities? If Clang/LLVM doesn't do PGO, that's their problem, nobody else's...


  • nanonyme
    replied
    Originally posted by smitty3268 View Post
    I suspect those tests where O2 outperformed O3 aren't very realistic. They probably have very small code bases that happen to fit into L1 with O2 and get enlarged a bit to only fit in the L2 cache with O3 optimizations, or something like that. Something that I imagine is mostly only true for microbenchmarks rather than for a real application.
    Depends. It seems certain optimizations in the Mesa drivers consisted of making structures smaller so they fit in the caches. Caches are really significant in modern computing, which is why -Os is sometimes wicked fast even though it has even fewer speed-oriented optimizations than -O2.
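
    A quick way to see the size difference being talked about; a minimal sketch, assuming a hypothetical source file app.c, and the actual numbers will of course depend on the code base and GCC version:

    # Build the same code at each optimization level
    gcc -O2 app.c -o app-O2
    gcc -O3 app.c -o app-O3
    gcc -Os app.c -o app-Os

    # Compare the size of the generated code (the "text" column);
    # -O3 typically produces the largest binary, -Os the smallest
    size app-O2 app-O3 app-Os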


  • monraaf
    replied
    Great article. IMHO more important than the benchmark results are the rather frequent occurrences where Clang/LLVM failed to compile something. There's a lot of talk out there about how Clang/LLVM is supposedly better than GCC. Rather than theoretical talk, this article brings some hard facts to the table: Clang/LLVM still fails miserably at what it's supposed to do, and where it does succeed the resulting binaries are often slower than GCC-produced binaries.


  • XorEaxEax
    replied
    Originally posted by Ex-Cyber View Post
    The Cyber Sled results are impressive; System 21 is a beast. Which Core i5 model is that, and how are you clocking it?
    Err... how do I check the model? cat /proc/cpuinfo only returns Core i5, no particular model as far as I can see. It's overclocked to 3.2GHz (originally 2.67GHz).
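
    For what it's worth, a minimal sketch of how to dig the model out of /proc/cpuinfo, assuming a reasonably recent kernel:

    # The full marketing name is normally in the "model name" field
    grep -m1 "model name" /proc/cpuinfo

    # Current clock as the kernel reports it
    grep -m1 "cpu MHz" /proc/cpuinfo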


  • Ex-Cyber
    replied
    The Cyber Sled results are impressive; System 21 is a beast. Which Core i5 model is that, and how are you clocking it?


  • XorEaxEax
    replied
    Originally posted by yotambien View Post
    That's interesting. What are the percentages? I mean, I suppose higher is better, but what are they? : D

    On the other hand, the PGO thingy looks like it actually makes a nice difference...
    Thanks for not rubbing it in ;D The percentages are relative to the game running at full speed (as in 100%), so in all these tests the emulated games run faster than they should (-nothrottle makes the emulator run as fast as it can). And yes, PGO does make a difference in CPU-intensive programs. The one standout here is Virtua Fighter Kids, which only differs from the other games in that its CPU emulation is done through a dynamic recompiler, so it obviously benefits a lot from some of the things PGO improves, like better branch prediction, loop unrolling, less cache thrashing, etc.


  • yotambien
    replied
    That's interesting. What are the percentages? I mean, I suppose higher is better, but what are they? : D

    On the other hand, the PGO thingy looks like it actually makes a nice difference...
