Optimizing Mesa Performance With Compiler Flags

  • phoronix
    started a topic Optimizing Mesa Performance With Compiler Flags

    Phoronix: Optimizing Mesa Performance With Compiler Flags

    Compiler tuning can lead to performance improvements for many computational benchmarks by toying with the CFLAGS/CXXFLAGS, but is there much gain out of optimizing your Mesa build? Here's some benchmark results...

    http://www.phoronix.com/vr.php?view=MTI4NTY

  • curaga
    replied
    I've always* built Mesa with -O3 and not once had an issue that was because of that.

    * not built git in the last 3-4 months since it requires newer autofoo and I'm too lazy.

  • Kayden
    replied
    Originally posted by Lockal
    I guess the bottleneck of most videogames is not OpenGL, unless the game is designed for high-end graphics cards. Check this with any profiler: gl... calls are almost unnoticeable among game physics and logic. Compiling the actual software and main libraries instead of the driver could give a very different result.
    Not in my experience. I've run a lot of benchmarks and games, and 'sysprof' often shows that _mesa_* calls (which are the actual implementation of the gl* calls) make up a very noticeable percentage of the profile.

  • smitty3268
    replied
    Originally posted by Adarion
    The question is indeed whether Mesa is the speed-limiting step (aka bottleneck) in the whole system here. But it won't hurt to keep my Gentoo CFLAGS as they are: mainly -march set and -O2. In a few cases I actually use -Os for VIA CPUs or AMD's old Geode LX. A few packages might dislike messing too much with CFLAGS, though.
    Mesa is much more likely to be the bottleneck with faster GPUs and lower resolutions. Michael testing an IGP at 1080p probably isn't going to show a lot.

  • smitty3268
    replied
    Originally posted by mark_
    OK, makes sense. But shouldn't the programmer use inline functions or macros in this case?
    I guess I will add the inline parameter to my CXXFLAGS and for single C packages.
    Function inlining varies a lot between software. In some cases, it gives huge speedups. Other times, it just results in slower performance and greater memory use. It can vary depending on how large your CPU cache is as well.

    You can even manually set the depth the compiler will inline down to - something Firefox does for example, because the default -O3 inlining was too much, but by limiting the inlining amount they could still turn on -O3 and get better results than plain old -O2.
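
As a rough illustration of steering inlining per function instead of relying on the global -O2/-O3 heuristics, here is a minimal C sketch that assumes GCC or Clang. The helper names are hypothetical and not taken from Mesa or Firefox. On the command line, GCC's `-finline-limit=N` caps the size of functions considered for inlining; whether that is the exact mechanism Firefox used is not stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only. */

/* Small and hot: ask GCC/Clang to inline it even where the heuristics would not. */
static inline __attribute__((always_inline)) int scale(int x)
{
    return x * 3 + 1;
}

/* Larger or colder: keep it out of line so its callers do not grow. */
static __attribute__((noinline)) long sum_scaled(const int *values, size_t count)
{
    long total = 0;

    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        total += scale(values[i]);   /* scale() is expanded in place here */
    return total;
}
```

Building the same program once with -O2 and once with -O3 plus an inline limit chosen by experiment, then timing both, is the only reliable way to tell which setting wins for a given program and CPU cache size.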

  • smitty3268
    replied
    Originally posted by nej_simon
    Then why not use something like -march=i686 -msse -msse2? That would enable gcc to use cmov and sse/sse2 instructions and the binaries would still run on a P4.
    The v2 patch now has these options, and will almost certainly get approved.

    -march=pentium4 -mtune=core2 -mfpmath=sse


    Actually that looks like a typo - the patch comments talk about sse2, but the patch itself just enables sse.
    Last edited by smitty3268; 01-29-2013, 12:09 AM.
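
One way to check what a given set of flags actually enables, rather than trusting patch comments, is to look at the compiler's predefined macros. The sketch below assumes GCC or Clang on x86; the function names are hypothetical, while `__SSE__`, `__SSE2__`, `__SSE_MATH__` and `__SSE2_MATH__` are macros those compilers define.

```c
#include <assert.h>

/* Hypothetical helper: reports which SSE level the compiler was allowed to use.
 * GCC and Clang define __SSE__ / __SSE2__ when the matching instruction set
 * is enabled by -msse / -msse2 or implied by the -march setting. */
static int enabled_sse_level(void)
{
#if defined(__SSE2__)
    return 2;
#elif defined(__SSE__)
    return 1;
#else
    return 0;   /* plain x87 or a non-x86 target */
#endif
}

/* Returns 1 when floating-point math is done in SSE registers
 * (the effect of -mfpmath=sse), 0 otherwise. */
static int uses_sse_math(void)
{
#if defined(__SSE_MATH__) || defined(__SSE2_MATH__)
    return 1;
#else
    return 0;
#endif
}

/* SSE math can only be in effect when SSE itself is enabled. */
static void check_consistency(void)
{
    assert(!uses_sse_math() || enabled_sse_level() >= 1);
}
```

Compiling this with different flag sets, such as the ones quoted above, and printing the two return values shows what the compiler was actually permitted to emit.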

  • duby229
    replied
    My understanding is that right now the biggest bottleneck in the OSS graphics stack is GEM/TTM. It needs to be replaced, but I don't think anybody has a good idea of what to replace it with.

  • Adarion
    replied
    The question is indeed whether Mesa is the speed-limiting step (aka bottleneck) in the whole system here. But it won't hurt to keep my Gentoo CFLAGS as they are: mainly -march set and -O2. In a few cases I actually use -Os for VIA CPUs or AMD's old Geode LX. A few packages might dislike messing too much with CFLAGS, though.

  • fuzz
    replied
    It would be nice to have a database/list of programs and their fastest compile flags (depending on the compiler/version of course).

  • Rigaldo
    replied
    Originally posted by mark_
    This affects C also, it looks like a function call is replaced by the function code. This should result in less stack usage but the function has to be so simple that creating a new stack entry costs more performance than executing the function. Seems to be relatively useless.
    Actually, since the function's code would be executed anyway, you should always gain performance from avoiding the new stack entry. The main drawbacks they try to avoid are probably bigger binaries and more memory usage for very large functions.
    And it could potentially allow even more optimization with the "neighbouring" code, since it's not isolated in a function anymore. There are way too many things to consider in compiler optimization.
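
The point about "neighbouring" code can be sketched in C as follows. The functions are hypothetical, and what a particular compiler really does depends on its version and flags; this only shows the kind of opportunity inlining opens up.

```c
#include <assert.h>

/* Hypothetical helper, small enough that an optimizing compiler will usually inline it. */
static inline int clamp(int value, int low, int high)
{
    assert(low <= high);
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}

/* Once clamp() is inlined here, the compiler sees the constant bounds 0 and 255,
 * so it can drop the low <= high check and simplify the comparisons, which it
 * generally cannot do when clamp() is an opaque out-of-line call. */
static int to_byte(int value)
{
    return clamp(value, 0, 255);
}

/* The arithmetic around the inlined body can then be optimized as one unit. */
static int brightness_percent(int value)
{
    return to_byte(value) * 100 / 255;
}
```

Comparing the assembly from `-O0` and `-O2` builds (for example with the compiler's `-S` option) is a simple way to see whether the call was removed.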
