RadeonSI With OpenGL 4 Showing Nice Performance Against Catalyst


  • #41
    This is all fine and dandy, but meanwhile R600 is still stuck at OpenGL 3.3 and that makes me unhappy...

    A bunch of AMD dGPUs and iGPUs (i.e. AMD Trinity and Richland) that are dirt cheap are still stuck at OpenGL 3.3 if we want to use OSS drivers...

    Any hope that can change in a not-so-far-away time frame?



    • #42
      Originally posted by dungeon View Post
      You know, try mesa 10.6 branch
      Well, that only works with llvm 3.5. The results are interesting:
      https://imgur.com/a/GrwtL (Different maps, but a few general observations are still possible)

      Not sure in what units the buffer wait time is measured here. (Maybe I should cherry-pick the recent gallium HUD improvements.) M may be milliseconds and k may be ns... If that's true, then that's a lot better. GTT usage is low, like with new mesa and nine. Interesting. GPU usage is a lot better. Performance is still not a constant 60+ fps, but then it's llvm 3.5.
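
      As a side note on the units: presumably the HUD just scales the raw counter values with metric prefixes, so the k/M suffix alone doesn't say what the underlying unit is. A toy sketch of that kind of formatting (not the actual gallium HUD code; format_value() is made up):

      /* Toy sketch of metric-prefix formatting (NOT the gallium HUD code,
       * format_value() is hypothetical): the k/M suffix only scales the raw
       * counter value, it doesn't say whether that value started out as ns,
       * us or ms, which is why the unit on the overlay is ambiguous. */
      #include <stdio.h>

      static void format_value(double v, char *buf, size_t len)
      {
         if (v >= 1000000.0)
            snprintf(buf, len, "%.1f M", v / 1000000.0);
         else if (v >= 1000.0)
            snprintf(buf, len, "%.1f k", v / 1000.0);
         else
            snprintf(buf, len, "%.2f", v);
      }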



      • #43
        10.6.5 should work with llvm 3.6.2, at least it works here - just tried.

        If that doesn't help, maybe the llvm scheduler is the problem for you, so try disabling it, or it's some regression in mesa. Some tips from me: this one slows down nearly everything (even glamor, so the desktop became a bit sluggish) by around 5% for me:

        http://cgit.freedesktop.org/mesa/mes...e1595418c6cea3

        And with this one I get big slowdowns in Doom 3 BFG; perf nearly halved. A perfectly threaded game, it just became capped with that one:

        http://cgit.freedesktop.org/mesa/mes...b952f475dfb444

        So it seems to me that whenever the devs increase something, things become slower... same as when they increased the pipe value so that Bioshock starts to work - well, nearly everything became a bit slower too
        Last edited by dungeon; 23 August 2015, 08:15 AM.



        • #44
          Originally posted by vadimg View Post
          Hi, Marek!
          But what is the problem with shader variants? They could be exposed as a single binary, in theory, i.e. the binary would contain all compiled variants and their keys. Is it problematic for some reason, other than it's not implemented yet?
          The problem is there are trillions of possible shader variants per shader. We could return some, but not all.
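
          For illustration only, here is a rough sketch of what the "single binary with all variants and keys" idea could look like as a data layout (these structs are hypothetical, not Mesa code). The catch is that the variant table needs one entry per state combination that was actually compiled, and as noted above that space is astronomically large:

          /* Hypothetical layout for "one binary containing every compiled
           * variant plus the key that selects it" -- not Mesa code, just to
           * make the idea concrete. */
          #include <stdint.h>

          struct variant_entry {
             uint64_t state_key;    /* packed state that forced this variant */
             uint32_t code_offset;  /* offset of its machine code in the blob */
             uint32_t code_size;
          };

          struct packed_shader_binary {
             uint32_t num_variants;
             struct variant_entry variants[];  /* followed by the code blobs */
          };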



          • #45
            Originally posted by vadimg View Post
            I haven't been working on the drivers for a long time, since I got a proprietary job, but I still try to use them for games sometimes, and then I hit freezes due to the long compilation time. I kinda solved it (or at least reduced it) for r600g with the custom backend, but I'm not ready yet to solve it the same way for radeonsi.
            Marek, do you think it's possible to solve it for radeonsi with the current backend? AFAIK Catalyst uses LLVM as well (on Windows too?), but there are no such issues, and AFAICS it doesn't use any caches, just multithreaded compilation.
            Catalyst doesn't use LLVM for OpenGL, at least. Once we reduce the number of shader variants, it will be a lot easier to add support for the shader cache.
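
            To make the "just multithreaded compilation" idea concrete, here is a minimal sketch (pure illustration, not radeonsi code; every name in it is made up) of compiling a variant on a worker thread so the draw path only blocks if the result is really needed right away:

            /* Illustration only: compile a shader variant on a worker thread
             * instead of stalling the frame for the full backend compile. */
            #include <pthread.h>
            #include <stdbool.h>
            #include <stdlib.h>

            /* stand-in for the expensive backend compile (e.g. LLVM codegen) */
            static void *compile_variant(const void *key)
            {
               (void)key;
               return malloc(64); /* pretend this is the generated machine code */
            }

            struct shader_variant {
               pthread_t thread;
               pthread_mutex_t lock;
               pthread_cond_t cond;
               bool ready;
               void *code;
            };

            static void *compile_thread(void *arg)
            {
               struct shader_variant *v = arg;
               void *code = compile_variant(NULL); /* variant key would go here */

               pthread_mutex_lock(&v->lock);
               v->code = code;
               v->ready = true;
               pthread_cond_signal(&v->cond);
               pthread_mutex_unlock(&v->lock);
               return NULL;
            }

            /* Called when the shader is first seen: start compiling without
             * blocking the thread that is recording draw calls. */
            void variant_compile_async(struct shader_variant *v)
            {
               pthread_mutex_init(&v->lock, NULL);
               pthread_cond_init(&v->cond, NULL);
               v->ready = false;
               v->code = NULL;
               pthread_create(&v->thread, NULL, compile_thread, v);
            }

            /* Called at draw time: block only if the code isn't ready yet. */
            void *variant_get_code(struct shader_variant *v)
            {
               pthread_mutex_lock(&v->lock);
               while (!v->ready)
                  pthread_cond_wait(&v->cond, &v->lock);
               pthread_mutex_unlock(&v->lock);
               return v->code;
            }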



            • #46
              Originally posted by haagch View Post
              So, a comparison with more HUD settings. Not very scientific, because I simply played without reproducing anything exactly, but it gives an idea.

              Shoots native: https://i.imgur.com/AbMkUbr.png
              Shoots nine: https://i.imgur.com/rRcUIsX.png

              Lake native: https://i.imgur.com/mJ9G7TW.png
              Lake nine: https://i.imgur.com/gRLjAai.png

              I haven't calculated the percentages, but from the looks of it the better performance can probably be explained in large part simply by the better GPU load.
              What exactly does the buffer wait time show? Because that one is a lot larger on native OpenGL.
              With nine it also uses a lot less GTT and quite a bit less VRAM. Hm...
              Buffer wait time means how long the CPU has waited for the GPU during CPU-GPU synchronizations. If it's nonzero all the time, the performance will suck. This can be the primary cause of the slowness.
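
              A minimal sketch of the idea behind that counter (hypothetical names, not the actual driver code): time the blocking wait the CPU does on a GPU fence and accumulate it, so the HUD can show how much of each frame the CPU spent stalled waiting for the GPU:

              /* Sketch only -- gpu_fence_wait() and struct gpu_fence are
               * hypothetical stand-ins for the driver's real sync primitives. */
              #include <stdint.h>
              #include <time.h>

              struct gpu_fence;                                   /* opaque handle */
              extern void gpu_fence_wait(struct gpu_fence *fence); /* blocks until GPU is done */

              static uint64_t buffer_wait_ns; /* total time the CPU spent waiting */

              static uint64_t now_ns(void)
              {
                 struct timespec ts;
                 clock_gettime(CLOCK_MONOTONIC, &ts);
                 return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
              }

              /* Wrap the blocking wait and accumulate how long the CPU was
               * stalled; a HUD counter would then just sample buffer_wait_ns
               * once per frame. */
              void wait_for_buffer(struct gpu_fence *fence)
              {
                 uint64_t start = now_ns();
                 gpu_fence_wait(fence);
                 buffer_wait_ns += now_ns() - start;
              }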



              • #47
                For me llvm 3.6.2 + mesa 10.6 didn't compile because of T->createMCInstPrinter(AsmPrinterVariant, *AsmInfo, *MII, *MRI, *STI)); in gallivm/lp_bld_debug.cpp (I think).

                Disabling the llvm scheduler requires a recompile, right? That always takes sooo much time...


                That sucks to revert. So I first tested the other one alone:


                Reverting this one definitely improves matters noticeably, makes it a lot smoother: https://imgur.com/a/kmMKD
                GPU usage goes up a few percent, fps go up a bit, and buffer wait time improves a lot. This is also the commit that caused the high GTT usage. (I'm currently using 4.2.0-rc7-mainline)

                Thanks for the short explanation, Marek. If M really means milliseconds in mesa 10.6, then something has happened since mesa 10.6/llvm 3.5. I'm going to try mesa master with llvm 3.5 to see whether it's llvm or mesa, and if it's mesa, whether I can bisect a bit.

                Last edited by haagch; 23 August 2015, 11:22 AM.



                • #48
                  It would suck to have Nvidia and see how much progress is happening in the open-source AMD driver.



                  • #49
                    Originally posted by moilami View Post
                    It would suck to have Nvidia and see how much progress is happening in the open-source AMD driver.
                    Except that Bioshock already runs around 40% faster with Catalyst 15.7, if Michael knows how to run it



                    • #50
                      Originally posted by haagch View Post
                      Reverting this one definitely improves matters noticeably, makes it a lot smoother: https://imgur.com/a/kmMKD
                      GPU usage goes up a few percent, fps go up a bit, and buffer wait time improves a lot. This is also the commit that caused the high GTT usage. (I'm currently using 4.2.0-rc7-mainline)
                      Thank you very much for testing. I will revert that commit. I can't imagine what we would do without you guys testing every commit.

                      I'll see what I can do with the viewport array perf regression next week.

