
Greater Radeon Gallium3D Shader Optimization Tests


  • Greater Radeon Gallium3D Shader Optimization Tests

    Phoronix: Greater Radeon Gallium3D Shader Optimization Tests

    After delivering preview benchmarks of the AMD Radeon Gallium3D driver's new shader optimization back-end, Vadim Girlin, the back-end's author, has shared some complementary Linux OpenGL benchmark results...

    http://www.phoronix.com/vr.php?view=MTM2NzM

  • #2
    So, what does it mean? Did you do something wrong, or do the optimizations apply only to his card?
    ## VGA ##
    AMD: X1950XTX, HD3870, HD5870
    Intel: GMA45, HD3000 (Core i5 2500K)



    • #3
      Pretty sure there's something wrong with Michael's benchmarks, at least for the Unigine and Doom 3 tests. These are x86, and it very much looks like he didn't correctly compile/install the updated Mesa x86 libraries - results are identical. Yet again I have to wonder why Michael didn't investigate this at all.



      • #4
        Originally posted by brent View Post
        Pretty sure there's something wrong with Michael's benchmarks, at least for the Unigine and Doom 3 tests. These are x86, and it very much looks like he didn't correctly compile/install the updated Mesa x86 libraries - results are identical. Yet again I have to wonder why Michael didn't investigate this at all.
        I think there will be no improvements with r600-sb for Doom 3 anyway; it uses simple shaders where it's hard to find anything to optimize. It's like trying to optimize a "Hello world" program.

        AFAIK Catalyst uses optimizations related to texture formats ("Catalyst AI" or something like that; I didn't look into exactly what it does, though). That option doubled Doom 3 performance with fglrx for me the last time I tested, so r600g may need the same optimizations.



        • #5
          Hmm, I'm not very familiar with Doom 3, but I was under the impression it used somewhat complex pixel shaders at least. AFAIK it was one of the first games to really push shader hardware, with high quality per-pixel lighting and shadows.

          Can you tell more about the optimizations Catalyst uses? Does it convert to a more favorable texture format under the hood or something like that?



          • #6
            Originally posted by brent View Post
            Hmm, I'm not very familiar with Doom 3, but I was under the impression it used somewhat complex pixel shaders at least. AFAIK it was one of the first games to really push shader hardware, with high quality per-pixel lighting and shadows.
            Possibly Doom 3's shaders were somewhat complex for the hardware available in 2004 (IIRC that was the time of the Radeon X800 series, the R4xx chips), but for newer hardware they are pretty trivial compared to, e.g., the shaders used by the Unigine demos. They are not even written in GLSL; they are ARB assembly programs.

            Originally posted by brent View Post
            Can you tell more about the optimizations Catalyst uses? Does it convert to a more favorable texture format under the hood or something like that?
            As I said, I didn't look into it, but I suspect it does something like that.
            There were also some shader tweaks that were probably included as app-specific optimizations in the proprietary drivers, e.g. using GPU math instead of a texture lookup to fetch precomputed values: http://forum.beyond3d.com/showthread.php?t=12732

            And now we have Doom3 sources, so it might be easier and more efficient to optimize the game itself for modern hardware than to optimize the drivers for this game.



            • #7
              For the first time in a year I've switched back to the FOSS drivers on my HD 5600 Mobility laptop. I enabled the shader back-end optimizations with a script in .kde/env:
              Code:
              $ env|grep R600
              R600_DEBUG=sb
              and WOW! Don't Starve works better than with fglrx (no cloud glitch and perfectly smooth*), Portal works perfectly, and Stacking works BETTER than it used to on Windows when I played it a year ago. All of them are playable at the highest settings with FXAA or MSAA at a good framerate.
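For reference, the setup described above can be sketched as a small script dropped into KDE's per-session environment directory, which KDE sources at login (the script filename below is my choice, not from the post):

```shell
#!/bin/sh
# Sketch of the setup described above: KDE runs executable scripts placed in
# ~/.kde/env at session start, so a one-liner there enables r600-sb for the
# whole session. The filename "r600-sb.sh" is an assumption for illustration.
envdir="$HOME/.kde/env"
mkdir -p "$envdir"
printf 'export R600_DEBUG=sb\n' > "$envdir/r600-sb.sh"
chmod +x "$envdir/r600-sb.sh"
```

After the next login, `env | grep R600` should show the flag as in the snippet above.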

              Only Surgeon Simulator 2013, a Unity 4-based game (http://www.youtube.com/watch?v=ZO10bAn_8M8) is glitchy, but it's already almost as fast as it was on fglrx (but not as fast as on Windows):

              So amazing job guys! What's missing now is proper power management, but I can live with switching between low and high profiles for a while.


              *: Actually Don't Starve is smoother than with fglrx during the first 45 minutes to an hour, but then, like it did with fglrx, it becomes choppy (usually when my character loses his sanity), and the slowdown/choppiness is much more noticeable than with fglrx. A game restart fixes it. BTW it's not a 2D game; it's 3D with a not-so-low polygon count (not just simple sprites) and shader-based effects.
              But overall it feels so much better because it doesn't have those "micro-freezes" fglrx has at all times; it's really perfectly smooth.
              Last edited by Azultra; 05-11-2013, 11:14 AM.



              • #8
                Originally posted by Azultra View Post
                For the first time in a year I've switched back to the FOSS drivers on my HD 5600 Mobility laptop. I enabled the shader back-end optimizations with a script in .kde/env:
                Code:
                $ env|grep R600
                R600_DEBUG=sb
                and WOW!

                ....

                So amazing job guys! What's missing now is proper power management, but I can live with switching between low and high profiles for a while.

                *: Actually Don't Starve is smoother than with fglrx during the first 45 minutes to an hour, but then, like it did with fglrx, it becomes choppy (usually when my character loses his sanity), and the slowdown/choppiness is much more noticeable than with fglrx. A game restart fixes it. BTW it's not a 2D game; it's 3D with a not-so-low polygon count (not just simple sprites) and shader-based effects.
                But overall it feels so much better because it doesn't have those "micro-freezes" fglrx has at all times; it's really perfectly smooth.
                Paypal: vadimgirlin at gmail dot com
                I will be donating a bit later this month; the guy clearly needs a better GPU card xD.

                Vadim's patches are awesome, radeon starts to match and even outperform fglrx!

                The choppy issue you are having is probably related to memory fragmentation, but within a 3D game that's usually handled by the game engine logic alone. Maybe the Linux version is simply bugged? Some tests on a GeForce might help confirm that.

                Thanks for the heads up!
                Last edited by brosis; 05-11-2013, 02:22 PM.



                • #9
                  Originally posted by Azultra View Post
                  Only Surgeon Simulator 2013, a Unity 4-based game (http://www.youtube.com/watch?v=ZO10bAn_8M8) is glitchy
                  As far as I can see it's a typical z-fighting issue. I can reproduce it with the Linux demo, and I think it's not a bug in r600g; ideally the game developers should take care of that. By the way, it doesn't happen for me with the Windows version of the demo (and Wine).



                  • #10
                    Originally posted by brosis View Post
                    Paypal: vadimgirlin at gmail dot com
                    I will be donating a bit later this month; the guy clearly needs a better GPU card xD.
                    Could you point me to where Vadim mentioned his current config?

                    Clubbing together so that he could buy himself a good graphics card is the least we could do for his awesome work!


                    And out of curiosity, could you or Vadim point me to mails or blog posts that explain a bit how you managed to get such a leap in performance with shaders?


                    Anyway, I wanted to see whether the drivers would stand the test of fire by doing some stuff I'd never consider doing with fglrx, i.e. running Steam Windows games through Wine. I'd never done that before, but the Windows version of Steam actually works great with Wine. The games themselves are another matter.

                    Oddworld: Stranger's Wrath was ported to PC with OpenGL, so it could potentially run perfectly on Linux, and it basically works:

                    ...but at 2 FPS. Is it falling back to software Mesa because of unsupported extensions?

                    The second game was Endless Space, a great 4X based on Unity 3. It can be run in OpenGL mode with the -force-opengl flag on Windows, but Unity 3 seems to do something with OpenGL contexts that is tolerated by WGL but not by GLX, so I couldn't get it to run that way at all. It launches in the default D3D9 mode, but all I get is a black background with weirdly colored stuff in place of the nebulae. Other people managed to get it working in D3D9:

                    http://appdb.winehq.org/objectManage...sion&iId=26773

                    so is it due to the drivers? Anyway, the best thing would be if Wine could make it run in OpenGL mode, and the drivers have nothing to do with that.
                    Last edited by Azultra; 05-12-2013, 02:05 PM.



                    • #11
                      I managed to get Endless Space working in OpenGL mode instead of the default D3D9 mode (I described the procedure here: http://forums.amplitude-studios.com/...l=1#post138130). It went from:

                      to

                      and it's working very nicely. The framerate is good, of course not as good as it was on Windows in D3D9 mode, but it's already glitch-free and fully playable. So this is just awesome once again; I can't believe I'm playing one of my favorite Windows games on the FOSS drivers!
                      Last edited by Azultra; 05-13-2013, 10:29 AM.



                      • #12
                        Originally posted by Azultra View Post
                        And out of curiosity, could you or Vadim point me to mails or blog posts that explain a bit how you managed to get such a leap in performance with shaders?
                        Basically r600-sb implements some well-known optimization algorithms used in many compilers. The default backend in r600g doesn't optimize anything at all, so it's like compiling the shaders with -O0, while r600-sb tries to get closer to -O2. It's not only bytecode optimization; there are also hardware-specific things like reducing register usage. On the r600 architecture the number of threads the hardware can run simultaneously is limited by the number of GPRs required by each thread, because they are allocated from a common, limited pool of register memory. Reducing that number often allows more threads to run, i.e. more vertices/pixels can be processed in parallel, resulting in better utilization of the hardware. One more example is reducing the stack size requirement, which limits the number of threads similarly to the register limit. That reduction is now implemented for the default backend as well, improving its performance too, but r600-sb helped to spot this performance issue, and in some cases it can further reduce stack usage compared to the default backend.
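The register-pressure effect can be illustrated with toy arithmetic (all numbers below are invented for the example; they are not real r600 hardware limits):

```shell
# Toy illustration of the GPR/thread trade-off described above.
# All values are made up for the example, not real r600 limits.
REGISTER_POOL=256      # GPRs shared by all threads on a SIMD (assumed)
GPRS_DEFAULT=16        # per-thread GPRs with the non-optimizing default backend (assumed)
GPRS_SB=8              # per-thread GPRs after r600-sb reduces register pressure (assumed)

# Threads in flight = pool size divided by per-thread register demand.
THREADS_DEFAULT=$(( REGISTER_POOL / GPRS_DEFAULT ))
THREADS_SB=$(( REGISTER_POOL / GPRS_SB ))
echo "default backend: $THREADS_DEFAULT threads; r600-sb: $THREADS_SB threads"
```

Halving per-thread register use doubles the number of threads the pool can hold, which is where the extra parallelism, and the better hardware utilization, comes from.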

                        Originally posted by Azultra View Post
                        Oddworld: Stranger's Wrath was ported on PC with OpenGL, so it could potentially run perfectly on Linux, and it basically works ..but at 2 FPS. Is it falling back to software Mesa because of unsupported extensions?
                        That's strange; if it's a 32-bit application I'd suspect something wrong with the 32-bit drivers. Running it with LIBGL_DEBUG=verbose can provide some hints if something is going wrong. Another possible issue is repetitive shader recompilation: typically the shaders should be compiled once, but in some cases recompilation becomes a problem. Running the app with R600_DEBUG=sb,sbstat will show how many shaders are compiled and when (it prints optimization statistics for each compiled shader). Running the app with MESA_DEBUG=1 can probably also provide some information. Btw, what FPS do you get in this game with the same GPU and the proprietary driver?
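The three suggestions above can be bundled into a small wrapper for convenience (the environment variables are the real ones named in the post; the wrapper function and its name are mine):

```shell
# Run a program with the debug variables suggested above:
#   LIBGL_DEBUG=verbose  -- hints about driver-loading problems
#                           (e.g. a fallback to software rendering)
#   MESA_DEBUG=1         -- general Mesa debug output
#   R600_DEBUG=sb,sbstat -- per-shader optimization statistics; repeated
#                           output for the same shader suggests recompilation
debug_run() {
    LIBGL_DEBUG=verbose MESA_DEBUG=1 R600_DEBUG=sb,sbstat "$@"
}

# Usage (the binary name is a placeholder):
#   debug_run ./game 2> debug.log
```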



                        • #13
                          I can't get mesa to work in SB mode; it just segfaults with any application:

                          Code:
                          export R600_DEBUG=sb
                          dmesg: glxgears[9677]: segfault at 5c ip 00007f1fd805d7d3 sp 00007fff383ffd40 error 4 in r600g_dri.so[7f1fd7d10000+598000]
                          Run with the "debug" USE flag enabled (Gentoo user):
                          Code:
                          sb/sb_core.cpp:292:translate_chip: Assertion `!"unknown chip"' failed.
                          Trace/breakpoint trap
                          Maybe the "unknown chip" portion means something. I'm using an AMD Trinity laptop (A10-4600M) if that's helpful (Northern Islands, using the r600 driver).

                          GDB backtrace with debug enabled:

                          Code:
                          #0  0x00007ffff41b690f in ?? () from /usr/lib64/dri/r600_dri.so
                          #1  0x00007ffff42ce323 in ?? () from /usr/lib64/dri/r600_dri.so
                          #2  0x00007ffff42cbeae in r600_sb_context_create(r600_context*) () from /usr/lib64/dri/r600_dri.so
                          #3  0x00007ffff42cc0e8 in r600_sb_bytecode_process () from /usr/lib64/dri/r600_dri.so
                          #4  0x00007ffff427e8c0 in ?? () from /usr/lib64/dri/r600_dri.so
                          #5  0x00007ffff42a9333 in ?? () from /usr/lib64/dri/r600_dri.so
                          #6  0x00007ffff42a9532 in ?? () from /usr/lib64/dri/r600_dri.so
                          #7  0x00007ffff42a959a in ?? () from /usr/lib64/dri/r600_dri.so
                          #8  0x00007ffff41abc61 in ?? () from /usr/lib64/dri/r600_dri.so
                          #9  0x00007ffff41ec2eb in ?? () from /usr/lib64/dri/r600_dri.so
                          #10 0x00007ffff41ec85a in ?? () from /usr/lib64/dri/r600_dri.so
                          #11 0x00007ffff41ec742 in ?? () from /usr/lib64/dri/r600_dri.so
                          #12 0x00007ffff41c15aa in ?? () from /usr/lib64/dri/r600_dri.so
                          #13 0x00007ffff4279daa in ?? () from /usr/lib64/dri/r600_dri.so
                          #14 0x00007ffff427b5c9 in ?? () from /usr/lib64/dri/r600_dri.so
                          #15 0x00007ffff3ea3665 in ?? () from /usr/lib64/dri/r600_dri.so
                          #16 0x00007ffff431ba24 in ?? () from /usr/lib64/dri/r600_dri.so
                          #17 0x00007ffff3ea41f8 in ?? () from /usr/lib64/dri/r600_dri.so
                          #18 0x00007ffff76e5cfc in ?? () from /usr/lib64/libGL.so.1
                          #19 0x00007ffff76ab1cf in ?? () from /usr/lib64/libGL.so.1
                          #20 0x00007ffff76ab5e7 in ?? () from /usr/lib64/libGL.so.1
                          #21 0x00007ffff76a5467 in ?? () from /usr/lib64/libGL.so.1
                          #22 0x00007ffff76a75fd in glXChooseVisual () from /usr/lib64/libGL.so.1
                          #23 0x0000000000403b9f in ?? ()
                          #24 0x0000000000401df4 in ?? ()
                          #25 0x00007ffff6abec15 in __libc_start_main () from /lib64/libc.so.6
                          #26 0x0000000000402751 in ?? ()
                          Backtrace without debug enabled:
                          Code:
                          #0  0x00007ffff44587d3 in r600_sb::bc_parser::parse_decls() () from /usr/lib64/dri/r600_dri.so
                          #1  0x00007ffff4459acd in r600_sb::bc_parser::parse_shader() () from /usr/lib64/dri/r600_dri.so
                          #2  0x00007ffff445a47d in r600_sb::bc_parser::parse() () from /usr/lib64/dri/r600_dri.so
                          #3  0x00007ffff445b81a in r600_sb_bytecode_process () from /usr/lib64/dri/r600_dri.so
                          #4  0x00007ffff4436d14 in ?? () from /usr/lib64/dri/r600_dri.so
                          #5  0x00007ffff44495b5 in ?? () from /usr/lib64/dri/r600_dri.so
                          #6  0x00007ffff44496ac in ?? () from /usr/lib64/dri/r600_dri.so
                          #7  0x00007ffff44496e1 in ?? () from /usr/lib64/dri/r600_dri.so
                          #8  0x00007ffff438d461 in ?? () from /usr/lib64/dri/r600_dri.so
                          #9  0x00007ffff43b21cb in ?? () from /usr/lib64/dri/r600_dri.so
                          #10 0x00007ffff43b2203 in ?? () from /usr/lib64/dri/r600_dri.so
                          #11 0x00007ffff439923f in ?? () from /usr/lib64/dri/r600_dri.so
                          #12 0x00007ffff4429836 in ?? () from /usr/lib64/dri/r600_dri.so
                          #13 0x00007ffff442a8e5 in ?? () from /usr/lib64/dri/r600_dri.so
                          #14 0x00007ffff41a8719 in ?? () from /usr/lib64/dri/r600_dri.so
                          #15 0x00007ffff4481e4b in ?? () from /usr/lib64/dri/r600_dri.so
                          #16 0x00007ffff41a95df in ?? () from /usr/lib64/dri/r600_dri.so
                          #17 0x00007ffff76e69f9 in ?? () from /usr/lib64/libGL.so.1
                          #18 0x00007ffff76c2d4a in ?? () from /usr/lib64/libGL.so.1
                          #19 0x00007ffff76bf5d2 in ?? () from /usr/lib64/libGL.so.1
                          #20 0x00007ffff76bfb68 in glXChooseVisual () from /usr/lib64/libGL.so.1
                          #21 0x0000000000403b9f in ?? ()
                          #22 0x0000000000401df4 in ?? ()
                          #23 0x00007ffff6ad8c15 in __libc_start_main () from /lib64/libc.so.6
                          #24 0x0000000000402751 in ?? ()



                          R600_DEBUG=sbdry works, so at least the optimising part works.

                          I've used safe CFLAGS in case it's a CFLAGS issue. I used much more aggressive ones before without issue with the standard r600 backend or with r600-llvm-compiler:

                          Code:
                          CFLAGS="-O1 -pipe -ggdb"
                          CXXFLAGS="-O1 -pipe -ggdb"



                          • #14
                            Originally posted by AnonymousCoward View Post
                            I can't get mesa to work in SB mode; it just segfaults with any application:
                            Just a missing case for ARUBA chips, fix pushed.



                            • #15
                              Originally posted by vadimg View Post
                              Just a missing case for ARUBA chips, fix pushed.
                              Thank you, that fixed it.

