Optimizing Mesa Performance With Compiler Flags


  • #16
    -Os is slower in some cases. I just tried it on r200 and immediately saw slower menus in SuperTuxKart: going through the kart chooser, for example, is sluggish, so it's a no-go...

    In my experience -O1 may be the best for Mesa stability, but the safe choice is to just go with -O2 and -pipe, which will produce smaller libraries. If you want to play with processor optimisation, add -march=blabla, but always stick with -O2 if you want to keep the driver stable.



    • #17
      Originally posted by mark_
      This affects C also; it looks like a function call is replaced by the function's code. This should result in less stack usage, but the function has to be so simple that creating a new stack entry costs more performance than executing the function. Seems to be relatively useless.
      Actually, since the function's code would be executed anyway, you should always gain performance from avoiding the new stack entry. The main drawbacks they try to avoid are probably bigger binaries and more memory usage for very large functions.
      It could also potentially allow even more optimization with the "neighbouring" code, since it's not isolated in a function anymore. There are way too many things to consider in compiler optimization.



      • #18
        It would be nice to have a database/list of programs and their fastest compile flags (depending on the compiler/version of course).



        • #19
          The question is indeed whether Mesa is the speed-limiting step (aka bottleneck) in the whole system here. But it won't hurt to keep my Gentoo CFLAGS as they are: mainly -march set and -O2. In a few cases I actually use -Os, for VIA CPUs or AMD's old Geode LX. A few packages might dislike messing too much with CFLAGS, though.



          • #20
            My understanding is that right now the biggest bottleneck in the OSS graphics stack is GEM/TTM. It needs to be replaced, but I don't think anybody has a good idea of what to replace it with.



            • #21
              Originally posted by nej_simon
              Then why not use something like -march=i686 -msse -msse2? That would enable gcc to use cmov and sse/sse2 instructions and the binaries would still run on a P4.
              The v2 patch now has these options, and will almost certainly get approved.

              -march=pentium4 -mtune=core2 -mfpmath=sse


              Actually that looks like a typo - the patch comments talk about sse2, but the patch itself just enables sse.
              Last edited by smitty3268; 01-28-2013, 11:09 PM.



              • #22
                Originally posted by mark_
                OK, makes sense. But shouldn't the programmer use inline functions or macros in this case?
                I guess I will add the inline parameter to my CXXFLAGS and for single C packages.
                Function inlining varies a lot between software. In some cases, it gives huge speedups. Other times, it just results in slower performance and greater memory use. It can vary depending on how large your CPU cache is as well.

                You can even manually set the depth the compiler will inline down to - something Firefox does for example, because the default -O3 inlining was too much, but by limiting the inlining amount they could still turn on -O3 and get better results than plain old -O2.



                • #23
                  Originally posted by Adarion
                  The question is indeed whether Mesa is the speed-limiting step (aka bottleneck) in the whole system here. But it won't hurt to keep my Gentoo CFLAGS as they are: mainly -march set and -O2. In a few cases I actually use -Os, for VIA CPUs or AMD's old Geode LX. A few packages might dislike messing too much with CFLAGS, though.
                  Mesa is much more likely to be the bottleneck with faster GPUs and lower resolutions. Michael testing an IGP at 1080p probably isn't going to show a lot.



                  • #24
                    Originally posted by Lockal
                    I guess the bottleneck of most video games is not OpenGL, unless the game is designed for high-end graphics cards. Check this with any profiler: gl... calls are almost unnoticeable among game physics and logic. Compiling the actual software and main libraries instead of the driver could give a very different result.
                    Not in my experience. I've run a lot of benchmarks and games, and 'sysprof' often shows that _mesa_* calls (which are the actual implementation of the gl* calls) are a very noticeable percentage.
                    Free Software Developer .:. Mesa and Xorg
                    Opinions expressed in these forum posts are my own.



                    • #25
                      I've always* built Mesa with -O3 and not once had an issue that was because of that.

                      * not built git in the last 3-4 months since it requires newer autofoo and I'm too lazy.

