R300,R400,R500,R600,R700 and more performance patch

  • R300,R400,R500,R600,R700 and more performance patch

    Hi all,

    I need some testers. I managed to hack/fix the _mesa_remove_extra_moves function in src/mesa/shader/prog_optimize.c into a usable state. As far as I could tell from my testing, there was an issue with this optimization pass and OPCODE_MUL. I just added an exception for this one instruction and made it easy to add others should further testing indicate they need to be added too.
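For readers who want a picture of what "an exception for one instruction that is easy to extend" can look like, here is a minimal sketch. It is not the actual patch. It assumes the pass decides, per instruction, whether a following MOV may be merged into it, and every name below (`sketch_opcode`, `excluded_opcodes`, `can_fold_mov_into`) is hypothetical and does not come from Mesa.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Mesa's opcode enum. Only a few names are
 * listed and the values are NOT Mesa's. */
enum sketch_opcode {
    SKETCH_OPCODE_MOV,
    SKETCH_OPCODE_MUL,
    SKETCH_OPCODE_ADD,
    SKETCH_OPCODE_MAD
};

/* Opcodes the pass should leave alone. Adding another exception is a
 * one-line change to this table. */
static const enum sketch_opcode excluded_opcodes[] = {
    SKETCH_OPCODE_MUL
};

/* Returns false if the opcode is on the exception list, true otherwise.
 * A real pass would apply further checks (registers, write masks, ...)
 * that this sketch does not model. */
static bool can_fold_mov_into(enum sketch_opcode op)
{
    size_t n = sizeof(excluded_opcodes) / sizeof(excluded_opcodes[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        if (excluded_opcodes[i] == op)
            return false;
    }
    return true;
}
```

Keeping the exceptions in a table means that further testing only has to add an entry, not touch the control flow of the pass.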

    It bumped my Nexuiz scores on demo1 from 5, 8, and 12 to 5, 9, and 13. It
    also reduced the testing runtime from 234 seconds to 225 seconds.


    I have only tested on my Radeon HD 3100 based laptop, but I would like to hear results from other types of cards too. I am looking for any problems it might cause as well as benchmarks.

    I have one very favorable result from an r500 user with q3a.

    The other report I have is from an r280 user, who reported that it caused no additional problems for him.

    It should apply to mesa 7.8 as well as current git master.



  • #2
    I might add that this patch should work on all chips that use GLSL (OpenGL Shading Language) in Mesa. So Intel and NVIDIA might get a boost too.

    I have tested Torcs, Nexuiz, Foobillard, and Celestia with no ill effects that I can see. I did have one machine lock up with Stormbaan Coureur but I have had lockups with no patches applied as well.


    • #3
      Originally posted by Obscene_CNN View Post
      It bumped my Nexuiz scores on demo1 from 5, 8, and 12 to 5, 9, and 13.
      Such a small increase can be considered a statistical error. The real bottleneck is elsewhere. You could easily achieve a 2x speedup if you concentrated on the real problems. If you think your patch is useful, take it to the mailing list.


      • #4
        marek,

        It's not a statistical error, and it is on the dri-devel mailing list (I am awaiting my confirmation email for the Mesa3d-dev mailing list). It's about a 4% improvement. Modifications I have made that gave a 5 to 10% increase in performance in Torcs failed to change my Nexuiz runtime by even 1 second, or the FPS at all.
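The "about 4%" figure can be checked against the runtimes given in the first post (234 seconds before, 225 seconds after). A minimal sketch of that arithmetic follows; the helper name `percent_change` is only illustrative.

```c
/* Relative change from `before` to `after`, as a percentage of `before`.
 * A negative result means the value went down. `before` must be nonzero. */
static double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

Calling `percent_change(234.0, 225.0)` gives roughly -3.8, i.e. the run finished about 3.8% faster, which rounds to the 4% quoted above.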

        I didn't write the function; Brian Paul, who started Mesa, did. Apparently he thought it was worthwhile to write it. I just found a bug in it and worked around it to make it usable.


        • #5
          Originally posted by marek View Post
          Such a small increase can be considered a statistical error. The real bottleneck is elsewhere. You could easily achieve a 2x speedup if you concentrated on the real problems. If you think your patch is useful, take it to the mailing list.
          Test first, complain later.

          This patch gave way better performance in q3a on my r500 (FireGL V5200) in UXGA mode (T60p). FPS was not the only problem (it is 25% lower with KMS compared to UMS); the frame flow is also laggy (not constant: frames accumulate and are thrown out at once). Try using a high resolution like me and see the difference; it's real. Same in Warsow and UT2K4, but tc-elite still lags at this resolution :/


          • #6
            Obscene_CNN, I would love to try your patch, but I am using fglrx right now (due to work). But I am sure many appreciate your effort in optimizing the drivers.

            BTW, is there any way of profiling the drivers?


            • #7
              So I just gave it a quick test with Nexuiz/PTS and r300/r300g (my card is an X1900XT, so r500).
              r300g (HDR on): without your patch 30.31, with it 30.03
              r300 (HDR off): without your patch 40.88, with it 40.79
              So I can't see any speedup. If anything it's slightly slower, but that might very well be random deviation.
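Whether a gap like 30.31 vs 30.03 is "random deviation" can only be judged against the spread of repeated runs. The sketch below shows one crude way to do that: compute the mean and sample variance of several runs of the same configuration, then ask whether the observed difference is larger than k standard deviations. The function names and the k-sigma rule of thumb are assumptions for illustration, not something used in this thread, and no per-run numbers were posted here to feed into it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Arithmetic mean of n samples (n must be > 0). */
static double mean(const double *x, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Unbiased sample variance (divides by n - 1); returns 0 for n < 2. */
static double sample_variance(const double *x, size_t n)
{
    double m, acc = 0.0;
    size_t i;

    if (n < 2)
        return 0.0;
    m = mean(x, n);
    for (i = 0; i < n; i++)
        acc += (x[i] - m) * (x[i] - m);
    return acc / (double)(n - 1);
}

/* True if |diff| is larger than k standard deviations. Squares are
 * compared so that no square root is needed. */
static bool exceeds_noise(double diff, double variance, double k)
{
    return diff * diff > k * k * variance;
}
```

To use it, run the benchmark several times without the patch, pass the scores to `sample_variance`, and test the patched-minus-unpatched difference with something like `exceeds_noise(diff, variance, 2.0)`. A proper significance test would be stricter, but this is enough to tell a sub-1% wobble from a real change.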


              • #8
                Hans,

                You can profile the drivers. I haven't yet. It's a little difficult, as you have to profile the kernel DRI driver, libdrm, Mesa, and the application.


                • #9
                  Zhick,

                  Thanks for your testing. I'm trying to figure out how it ended up slower for you.

                  One contributing factor is that there is one more optimization pass over the GPU instructions, which incurs more CPU overhead. However, it should not be that much.


                  • #10
                    Zhick,

                    The best I can figure is that your performance is not GPU limited; it is CPU or DMA limited. I will try to come up with an additional patch for you to try, to verify this.

                    Do you use an x86-64 distribution by chance?
