
Greater Radeon Gallium3D Shader Optimization Tests


  • #11
    I managed to get Endless Space working in OpenGL mode instead of the default D3D9 mode (I described the procedure here: http://forums.amplitude-studios.com/...l=1#post138130).

    [before/after screenshots]

    It's working very nicely. The framerate is good, of course not as good as it was on Windows in D3D9 mode, but it's already glitch-free and fully playable. So this is just awesome once again; I can't believe I'm playing one of my favorite Windows games on the FOSS drivers!
    Last edited by Azultra; 13 May 2013, 10:29 AM.



    • #12
      Originally posted by Azultra View Post
      And out of curiosity, could you or Vadim point me to mails or blog posts that explain a bit how you managed to get such a leap in performance with shaders?
      Basically, r600-sb implements some well-known optimization algorithms used in many compilers. The default backend in r600g doesn't optimize anything at all, so it's like compiling the shaders with -O0, while r600-sb tries to get closer to -O2.

      It's not only bytecode optimization; there are also hardware-specific things like reducing register usage. On the r600 architecture, the number of threads the hardware can run simultaneously is limited by the number of GPRs required for each thread, because they are allocated from a common, limited pool of register memory. Reducing that number often allows more threads to run, i.e. more vertices/pixels can be processed in parallel, resulting in better utilization of the hardware.

      One more example is reducing the stack size requirement, which also limits the number of threads, similarly to the register limit. That is now implemented for the default backend as well, improving its performance too, but r600-sb helped to spot this performance issue, and in some cases it can reduce stack usage further than the default backend.
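      To make the register-pressure point concrete, here is a rough sketch of the arithmetic (the pool size and GPR counts below are invented for illustration, not actual R600 limits): the number of threads that can be resident at once is roughly the shared pool size divided by the per-thread GPR count, so every register shaved off a shader buys parallelism.

      Code:
      # Illustrative only: resident threads ~ GPR pool / GPRs per thread
      POOL=256                # hypothetical shared GPR pool size
      echo $(( POOL / 12 ))   # 21 threads if a shader needs 12 GPRs
      echo $(( POOL / 8 ))    # 32 threads if optimization gets that down to 8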

      Originally posted by Azultra View Post
      Oddworld: Stranger's Wrath was ported to PC with OpenGL, so it could potentially run perfectly on Linux, and it basically works... but at 2 FPS. Is it falling back to software Mesa because of unsupported extensions?
      That's strange. If it's a 32-bit application, I'd suspect something wrong with the 32-bit drivers; running it with LIBGL_DEBUG=verbose can provide some hints if something is going wrong. Another possible issue is repetitive shader recompilation: typically the shaders should be compiled once, but in some cases recompilation becomes a problem. Running the app with R600_DEBUG=sb,sbstat will show how many shaders are compiled and when (it prints optimization statistics for each compiled shader). Running the app with MESA_DEBUG=1 can probably also provide some information. Btw, what FPS do you get in this game with the same GPU and the proprietary driver?
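      For reference, the suggested diagnostics look like this on the command line (./game below is just a placeholder for the actual binary):

      Code:
      # Hints about driver-loading problems or a software fallback:
      LIBGL_DEBUG=verbose ./game

      # How many shaders are compiled and when, with r600-sb optimization
      # statistics printed for each compiled shader:
      R600_DEBUG=sb,sbstat ./game

      # General Mesa diagnostics:
      MESA_DEBUG=1 ./game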



      • #13
        I can't get Mesa to work in SB mode; it just segfaults with any application:

        Code:
        export R600_DEBUG=sb
        dmesg: glxgears[9677]: segfault at 5c ip 00007f1fd805d7d3 sp 00007fff383ffd40 error 4 in r600g_dri.so[7f1fd7d10000+598000]
        Run with "debug" useflag enabled (Gentoo user):
        Code:
        sb/sb_core.cpp:292:translate_chip: Assertion `!"unknown chip"' failed.
        Trace/breakpoint trap
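        (For anyone on Gentoo wanting the same debug build, the procedure is roughly the following sketch; the exact package atom and package.use layout may differ on your system:)

        Code:
        # Rebuild mesa with assertions/debug info enabled:
        echo "media-libs/mesa debug" >> /etc/portage/package.use
        emerge --oneshot media-libs/mesa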
        Maybe the "unknown chip" part means something. I'm using an AMD Trinity laptop (A10-4600M) if that's helpful (Northern Islands generation, handled by the r600 driver).

        GDB backtrace with debug enabled:

        Code:
        #0  0x00007ffff41b690f in ?? () from /usr/lib64/dri/r600_dri.so
        #1  0x00007ffff42ce323 in ?? () from /usr/lib64/dri/r600_dri.so
        #2  0x00007ffff42cbeae in r600_sb_context_create(r600_context*) () from /usr/lib64/dri/r600_dri.so
        #3  0x00007ffff42cc0e8 in r600_sb_bytecode_process () from /usr/lib64/dri/r600_dri.so
        #4  0x00007ffff427e8c0 in ?? () from /usr/lib64/dri/r600_dri.so
        #5  0x00007ffff42a9333 in ?? () from /usr/lib64/dri/r600_dri.so
        #6  0x00007ffff42a9532 in ?? () from /usr/lib64/dri/r600_dri.so
        #7  0x00007ffff42a959a in ?? () from /usr/lib64/dri/r600_dri.so
        #8  0x00007ffff41abc61 in ?? () from /usr/lib64/dri/r600_dri.so
        #9  0x00007ffff41ec2eb in ?? () from /usr/lib64/dri/r600_dri.so
        #10 0x00007ffff41ec85a in ?? () from /usr/lib64/dri/r600_dri.so
        #11 0x00007ffff41ec742 in ?? () from /usr/lib64/dri/r600_dri.so
        #12 0x00007ffff41c15aa in ?? () from /usr/lib64/dri/r600_dri.so
        #13 0x00007ffff4279daa in ?? () from /usr/lib64/dri/r600_dri.so
        #14 0x00007ffff427b5c9 in ?? () from /usr/lib64/dri/r600_dri.so
        #15 0x00007ffff3ea3665 in ?? () from /usr/lib64/dri/r600_dri.so
        #16 0x00007ffff431ba24 in ?? () from /usr/lib64/dri/r600_dri.so
        #17 0x00007ffff3ea41f8 in ?? () from /usr/lib64/dri/r600_dri.so
        #18 0x00007ffff76e5cfc in ?? () from /usr/lib64/libGL.so.1
        #19 0x00007ffff76ab1cf in ?? () from /usr/lib64/libGL.so.1
        #20 0x00007ffff76ab5e7 in ?? () from /usr/lib64/libGL.so.1
        #21 0x00007ffff76a5467 in ?? () from /usr/lib64/libGL.so.1
        #22 0x00007ffff76a75fd in glXChooseVisual () from /usr/lib64/libGL.so.1
        #23 0x0000000000403b9f in ?? ()
        #24 0x0000000000401df4 in ?? ()
        #25 0x00007ffff6abec15 in __libc_start_main () from /lib64/libc.so.6
        #26 0x0000000000402751 in ?? ()
        Backtrace without debug enabled:
        Code:
        #0  0x00007ffff44587d3 in r600_sb::bc_parser::parse_decls() () from /usr/lib64/dri/r600_dri.so
        #1  0x00007ffff4459acd in r600_sb::bc_parser::parse_shader() () from /usr/lib64/dri/r600_dri.so
        #2  0x00007ffff445a47d in r600_sb::bc_parser::parse() () from /usr/lib64/dri/r600_dri.so
        #3  0x00007ffff445b81a in r600_sb_bytecode_process () from /usr/lib64/dri/r600_dri.so
        #4  0x00007ffff4436d14 in ?? () from /usr/lib64/dri/r600_dri.so
        #5  0x00007ffff44495b5 in ?? () from /usr/lib64/dri/r600_dri.so
        #6  0x00007ffff44496ac in ?? () from /usr/lib64/dri/r600_dri.so
        #7  0x00007ffff44496e1 in ?? () from /usr/lib64/dri/r600_dri.so
        #8  0x00007ffff438d461 in ?? () from /usr/lib64/dri/r600_dri.so
        #9  0x00007ffff43b21cb in ?? () from /usr/lib64/dri/r600_dri.so
        #10 0x00007ffff43b2203 in ?? () from /usr/lib64/dri/r600_dri.so
        #11 0x00007ffff439923f in ?? () from /usr/lib64/dri/r600_dri.so
        #12 0x00007ffff4429836 in ?? () from /usr/lib64/dri/r600_dri.so
        #13 0x00007ffff442a8e5 in ?? () from /usr/lib64/dri/r600_dri.so
        #14 0x00007ffff41a8719 in ?? () from /usr/lib64/dri/r600_dri.so
        #15 0x00007ffff4481e4b in ?? () from /usr/lib64/dri/r600_dri.so
        #16 0x00007ffff41a95df in ?? () from /usr/lib64/dri/r600_dri.so
        #17 0x00007ffff76e69f9 in ?? () from /usr/lib64/libGL.so.1
        #18 0x00007ffff76c2d4a in ?? () from /usr/lib64/libGL.so.1
        #19 0x00007ffff76bf5d2 in ?? () from /usr/lib64/libGL.so.1
        #20 0x00007ffff76bfb68 in glXChooseVisual () from /usr/lib64/libGL.so.1
        #21 0x0000000000403b9f in ?? ()
        #22 0x0000000000401df4 in ?? ()
        #23 0x00007ffff6ad8c15 in __libc_start_main () from /lib64/libc.so.6
        #24 0x0000000000402751 in ?? ()



        R600_DEBUG=sbdry works, so at least the optimising part works.

        I've used safe CFLAGS in case it's a compiler-flags issue; I used much more aggressive ones before without problems with the standard r600 backend or with r600-llvm-compiler:

        Code:
        CFLAGS="-O1 -pipe -ggdb"
        CXXFLAGS="-O1 -pipe -ggdb"



        • #14
          Originally posted by AnonymousCoward View Post
          I can't get Mesa to work in SB mode; it just segfaults with any application:
          Just a missing case for ARUBA chips, fix pushed.



          • #15
            Originally posted by vadimg View Post
            Just a missing case for ARUBA chips, fix pushed.
            Thank you, that fixed it.



            • #16
              Originally posted by brosis View Post
              Paypal: vadimgirlin at gmail dot com
              I will be donating a bit later this month, the guy clearly needs a better GPU card xD.

              Vadim's patches are awesome, radeon is starting to match and even outperform fglrx!
              Hmmm, maybe we should be donating towards an even worse card if it's optimisations we are after

              Disclaimer: The above comment is a joke; any response that fails to recognise this will be ignored by its author.



              • #17
                I've been running the video stress test in Half-Life 2: Lost Coast and I'm seeing some improvement with the SB backend, using my A10-4600M APU. Each figure is an average of 3 runs per configuration to minimise variance:

                Default backend: 21.62 fps

                LLVM backend: 23.99 fps (graphical glitches on the helicopter) = 11% improvement

                SB backend: 26.31 fps = 22% improvement

                SB + LLVM: 25.74 fps = 19% improvement

                Windows 8: 88.52 fps = 309% improvement
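                (The improvement figures are just (result - baseline) / baseline; a quick way to reproduce them:)

                Code:
                # improvement over the 21.62 fps default-backend baseline
                awk 'BEGIN { printf "%.0f%%\n", (26.31 - 21.62) / 21.62 * 100 }'   # 22%
                awk 'BEGIN { printf "%.0f%%\n", (88.52 - 21.62) / 21.62 * 100 }'   # 309%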

                Well, some nice improvements for both of the alternative backends. I also found that the default backend would randomly run much slower than normal, around 18 fps instead of 21 or so; I removed those runs from the average.

                Unfortunately, these results show that the open source drivers are still a lot slower than on Windows, although perhaps some of that is a weakness of the Linux port specifically.

                I also tried to run the benchmark with fglrx, but I couldn't get it to work; I just get a black screen when starting the game. Probably a configuration issue on my end, though. Portal doesn't work with fglrx either, but I had it working before.



                • #18
                  Originally posted by AnonymousCoward View Post
                  I've been running the video stress test in Half-Life 2: Lost Coast and I'm seeing some improvement with the SB backend, using my A10-4600M APU. Each figure is an average of 3 runs per configuration to minimise variance:

                  Default backend: 21.62
                  ...
                  SB backend: 26.31

                  Windows 8: 88.52
                  I suspect some of the functionality being used is still falling back to the CPU. It would surely be nice if we could trace it :/
                  Maybe it would be a good idea to test an array of applications (depending on GL level) on both platforms and report the deficiencies found.
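                  One rough way to start chasing that down (glxinfo and perf are generic tools, and hl2_linux is just my guess at the game's process name):

                  Code:
                  # Confirm the hardware driver is in use, not llvmpipe/softpipe:
                  glxinfo | grep "OpenGL renderer"

                  # Watch where CPU time goes while the game runs; heavy time in the
                  # driver or in Mesa's software paths will show up near the top:
                  perf top -p "$(pidof hl2_linux)"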



                  • #19
                    Do you use Wine to run HL2: Lost Coast? (HL2 is ported to Linux, but I'm not sure about Lost Coast)



                    • #20
                      Originally posted by vljn View Post
                      Do you use Wine to run HL2: Lost Coast? (HL2 is ported to Linux, but I'm not sure about Lost Coast)
                      No, I'm running the native port. I may well test the Wine version to compare, though.

