Mesa Threaded OpenGL Dispatch Finally Landing, Big Perf Win For Some Games


  • #11
    Originally posted by indepe View Post

    Well, insofar as it is described in the article, in theory this technique can be used with any API, single- or multi-threaded, or at any level of an application's internal or external call stack. However, in the case of Vulkan, it is less likely to be a meaningful way to split the work across multiple threads, since other options exist.

    Usually, I think, you would dispatch work on a higher level. To do it at the external API level is a way that can be implemented by a library below the application level, transparent to the application. In a sense, it is a substitute for multithreading within the application, and may in some cases work very well, and in others not so well (involving unnecessary copying of data, for example).

    Makes me wonder if it could be combined with GLVND to optionally benefit any/all OpenGL drivers on Linux.
    This has got nothing to do with anything else but GL specifically.

    Comment


    • #12
      Originally posted by funfunctor View Post

      This has got nothing to do with anything else but GL specifically.
      I know. What I am saying is that, in theory, instead of "Stash the GL calls in a batchbuffer" you could also "Stash the Vulkan calls in a batchbuffer" or "Stash the printf calls in a batchbuffer". Actually, the latter is something I was planning to do (or at least try out) this coming week, even though printf is not a single-threaded API. (Or at least the printf implementation I am using for debugging purposes doesn't mix output from concurrent calls.)

      Of course, you were correct to point out that the single-threaded nature of OpenGL does not generally apply to GPUs, but is specific to the OpenGL API itself.
      Last edited by indepe; 06 February 2017, 03:24 AM.
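The "stash the printf calls in a batchbuffer" idea from the post above can be sketched roughly as follows. This is only an illustration in Python, not Mesa's glthread code: `BatchedPrinter`, its `printf` method, and the `sink` callback are hypothetical names invented for this example. Callers enqueue formatted messages and return immediately, while a single worker thread drains the queue and does the actual output.

```python
import queue
import threading


class BatchedPrinter:
    """Hypothetical helper: callers enqueue messages, one worker writes them."""

    _STOP = object()  # sentinel telling the worker to exit

    def __init__(self, sink):
        self._sink = sink            # any callable that takes one string
        self._queue = queue.Queue()  # thread-safe FIFO acting as the "batchbuffer"
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def printf(self, fmt, *args):
        # Format on the caller's thread so the arguments are captured right away,
        # then return without waiting for the output to be written.
        self._queue.put(fmt % args)

    def _drain(self):
        # Only this thread touches the sink, so output is never interleaved.
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self._sink(item)

    def close(self):
        # Let the worker finish everything queued so far, then wait for it.
        self._queue.put(self._STOP)
        self._worker.join()


lines = []
printer = BatchedPrinter(lines.append)
for i in range(3):
    printer.printf("frame %d done", i)
printer.close()
```

Formatting on the calling thread is the kind of copying mentioned earlier in the thread: the arguments have to be captured before the call returns, which costs something even when the worker thread is idle.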

      Comment


      • #13
        I did some tests; for many apps (csgo, glxgears, gputest) mesa_glthread does not work.
        Code:
        _mesa_glthread_init
        _mesa_glthread_destroy
        Talos result:
        Talos Ultra           ren_bMultiThreadedRendering=1   ren_bMultiThreadedRendering=0
        mesa_glthread=true    44,5\61,4\33,9                  43,9\60,4\32,6
        mesa_glthread=false   45,5\62,1\33,3                  43,7\59,9\32,4

        Comment


        • #14
          Originally posted by Mark Rose View Post
          I hope this helps with ETS2!
          I believe that game would benefit from a bit more polishing by the developer. Even on Windows, the difference between DirectX and OpenGL is huge.

          Comment


          • #15
            Originally posted by indepe View Post

            I know. What I am saying is that, in theory, instead of "Stash the GL calls in a batchbuffer" you could also "Stash the Vulkan calls in a batchbuffer" or "Stash the printf calls in a batchbuffer". Actually, the latter is something I was planning to do (or at least try out) this coming week, even though printf is not a single-threaded API. (Or at least the printf implementation I am using for debugging purposes doesn't mix output from concurrent calls.)

            Of course, you were correct to point out that the single-threaded nature of OpenGL does not generally apply to GPUs, but is specific to the OpenGL API itself.
            Ah, I see where your misunderstanding is coming from now. Right, so with Vk you don't need to do that, as Vk already has things like command buffers, where you can configure as many concurrent streams as you like and handle all that yourself. This is what is meant by Vk being "low level": you as the user become responsible for setting up the threads, buffers and whatever else, packing them with data and sending them on their way. Hope this helps without being too technical; let me know if you still don't understand and I can explain in more depth.
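To make the "as many concurrent streams as you like" point concrete, here is a loose sketch in Python. It does not use the Vulkan API: `record_commands`, `build_frame`, and the command strings are all hypothetical stand-ins. Each thread records into its own buffer without locking, and the application itself decides the order in which the finished buffers are submitted.

```python
import threading


def record_commands(thread_index, draws_per_thread, buffers):
    # Each thread fills its own list, so no locking is needed while recording.
    first = thread_index * draws_per_thread
    commands = [f"draw object {first + i}" for i in range(draws_per_thread)]
    buffers[thread_index] = commands


def build_frame(num_threads=4, draws_per_thread=2):
    buffers = [None] * num_threads  # one slot per recording thread
    threads = [
        threading.Thread(target=record_commands,
                         args=(index, draws_per_thread, buffers))
        for index in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # The application, not the driver, picks the submission order.
    submitted = []
    for commands in buffers:
        submitted.extend(commands)
    return submitted


frame = build_frame()
```

The contrast with the glthread approach is that here the threading is visible to and controlled by the application, instead of being added underneath it by a library.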

            Comment


            • #16
              Originally posted by funfunctor View Post

              Ah, I see where your misunderstanding is coming from now. Right, so with Vk you don't need to do that, as Vk already has things like command buffers, where you can configure as many concurrent streams as you like and handle all that yourself. This is what is meant by Vk being "low level": you as the user become responsible for setting up the threads, buffers and whatever else, packing them with data and sending them on their way. Hope this helps without being too technical; let me know if you still don't understand and I can explain in more depth.
              I'm not sure what the misunderstanding is. I'd say that with OpenGL this technique may make sense because the CPU overhead of API calls is high, so you might try to move that overhead to a different core. With Vulkan the overhead is lower (aside from the fact that the technique has to compete with other ways of making use of multiple cores).

              Comment


              • #17
                this is much more important than on-disk cache

                Comment


                • #18
                  Originally posted by pal666 View Post
                  this is much more important than on-disk cache
                  In my opinion, glthread (if enabled from command-line) will be slowing down a large number of OpenGL apps this year (2017) and this issue won't be resolved until year 2018+.

                  On-disk cache is much closer to being capable of working as expected/intended in year 2017 than glthread (mareko).

                  Comment


                  • #19
                    It often seems that the open source AMD drivers are bottlenecked by the CPU. I could see how this feature may substantially improve performance, particularly in games that use post-processing.

                    Originally posted by atomsymbol View Post
                    In my opinion, glthread (if enabled from command-line) will be slowing down a large number of OpenGL apps this year (2017) and this issue won't be resolved until year 2018+.

                    On-disk cache is much closer to being capable of working as expected/intended in year 2017 than glthread (mareko).
                    Though I agree that ODC is much closer to being readily usable in 2017, I don't really understand why glthread would slow down many GL apps. I could understand it not having any impact on some games, or maybe slowing things down depending on the CPU used, but not so much depending on the app itself.

                    Comment


                    • #20
                      Originally posted by schmidtbag View Post
                      Though I agree that ODC is much closer to being readily usable in 2017, I don't really understand why glthread would slow down many GL apps. I could understand it not having any impact on some games, or maybe slowing things down depending on the CPU used, but not so much depending on the app itself.
                      Well, I compiled https://cgit.freedesktop.org/~mareko/mesa/?h=glthread, ran a game and observed a performance decrease of up to 60%. I am not claiming that glthread doesn't benefit some other games.

                      In general, the only thing that decides whether multi-threaded code performing a task is faster than single-threaded code performing the same task is whether it can be computed beforehand that the former is faster than the latter. If it cannot be computed that it is faster, it may just as well be slower.
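One way to check a claim like this instead of guessing is simply to time both variants. The sketch below is a hypothetical Python micro-benchmark, not a measurement of Mesa: `cheap_call`, `run_direct`, and `run_queued` are invented names, and Python's interpreter lock means the numbers say nothing about glthread itself. It only illustrates that when each call is very cheap, the cost of handing it to another thread may outweigh whatever is saved.

```python
import queue
import threading
import time


def cheap_call(state, value):
    # Stand-in for an API call that does very little work.
    state[0] += value


def run_direct(n):
    # Make every call on the calling thread.
    state = [0]
    start = time.perf_counter()
    for i in range(n):
        cheap_call(state, i)
    return state[0], time.perf_counter() - start


def run_queued(n):
    # Hand every call to a worker thread through a queue.
    state = [0]
    pending = queue.Queue()

    def worker():
        while True:
            value = pending.get()
            if value is None:  # sentinel: no more work
                break
            cheap_call(state, value)

    thread = threading.Thread(target=worker)
    start = time.perf_counter()
    thread.start()
    for i in range(n):
        pending.put(i)
    pending.put(None)
    thread.join()
    return state[0], time.perf_counter() - start


direct_total, direct_seconds = run_direct(20000)
queued_total, queued_seconds = run_queued(20000)
print(f"direct: {direct_seconds:.4f}s  queued: {queued_seconds:.4f}s")
```

Both variants compute the same total, so any difference in the printed times comes from the dispatch mechanism rather than from the work itself.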

                      Comment
