Using Multiple Threads With The Vulkan API

  • Using Multiple Threads With The Vulkan API

    Phoronix: Using Multiple Threads With The Vulkan API

    Tobias Hector has written an insightful blog post about scaling Vulkan to multiple CPU threads...


  • #2
    Did he mention the ability to use multiple cpus across vendors? Must be fast to use my gtx770 simultaneously together with my integrated intel graphics.



    • #3
      Originally posted by mike4 View Post
      Did he mention the ability to use multiple cpus across vendors? Must be fast to use my gtx770 simultaneously together with my integrated intel graphics.
      He's talking about CPU multithreading, not GPU crossovers across vendors... Completely unrelated, and definitely not the same thing as using multiple CPUs across vendors, which I'm pretty sure Vulkan doesn't support. (Damn... the implications of doing that... luckily most people only have one CPU in their computer; it's only servers that have more.)

      In case you haven't been keeping track of things, DirectX 11 and lower, as well as OpenGL, are T.E.R.R.I.B.L.E. at multithreading. Surely you've heard the reports of Nvidia cards performing worse when multithreading is turned on (that's why: OpenGL and DX11 are both just old and were initially designed with single-core processors in mind), which is why (although not everyone knows this) whichever CPU provides the best per-thread performance is the best CPU for gaming. Vulkan fixes that by allowing for better multithreading support, as in actually working multithreading support. We've been using graphics APIs made for hardware that's been outdated for more than a decade for far too long.

      I wonder why I bothered explaining this, if you'd just read that stuff on imagination tech's site that Michael posted in the article, you would know this...

      This here screenshot shows exactly what I'm talking about (OpenGL ES on the left, Vulkan on the right). As you can see, OpenGL only uses two cores (and only one of them at a time), while Vulkan uses all the available cores, and you can also see the proportional FPS increase because of it. This is the most important aspect of Vulkan and DX12: it's not that they're low level, it's that they're multithreading friendly.

      http://blog.imgtec.com/wp-content/up...h_es_vs_vk.png
      Last edited by rabcor; 25 November 2015, 12:09 PM.



      • #4
        Originally posted by mike4 View Post
        Did he mention the ability to use multiple cpus across vendors? Must be fast to use my gtx770 simultaneously together with my integrated intel graphics.
        Even if it is possible, it's not a great idea unless you specifically want each GPU to do unrelated work, simply because memory latency will murder your FPS in cold blood otherwise.

        This could be great for streaming and image filtering though, i.e. your big GPU deals with the game while your IGP deals with the GEM/DMA_BUF buffers containing the frames, doing encoding and post-processing in real time using compute code.



        • #5
          I don't think the APIs are the only ones to blame. Some engines do better than others with multicore CPUs. CryEngine, for example, is very good at taxing all the cores of my FX8350, while other engines only touch 2 or 4 cores.

          Then you read a comment from Matt Wagner, the head behind the simulators Lock-On, Flaming Cliffs and the most recent DCS World, a game that charges you between $30 and $60 for each DLC (the base game is free and comes with only 2 or 3 planes), saying that their patched-to-death engine (it has been around since the early 2000s) will not get multithreading because it would only bring marginal performance gains... That's like saying a modern RTS game gains nothing from multithreading support. He'd be better off admitting that his company doesn't have the money to rewrite their engine to take advantage of modern CPUs.



          • #6
            Originally posted by M@GOid View Post
            I don't think the APIs are the only ones to blame. Some engines do better than others with multicore CPUs. CryEngine, for example, is very good at taxing all the cores of my FX8350, while other engines only touch 2 or 4 cores.

            Then you read a comment from Matt Wagner, the head behind the simulators Lock-On, Flaming Cliffs and the most recent DCS World, a game that charges you between $30 and $60 for each DLC (the base game is free and comes with only 2 or 3 planes), saying that their patched-to-death engine (it has been around since the early 2000s) will not get multithreading because it would only bring marginal performance gains... That's like saying a modern RTS game gains nothing from multithreading support. He'd be better off admitting that his company doesn't have the money to rewrite their engine to take advantage of modern CPUs.
            To be honest, multithreading with OpenGL/DX is quite a conceptual stretch, because it's more about design techniques that try to stuff as many state-unrelated operations as possible onto different cores than about actual multithreading. Vulkan, on the other hand, is made for multithreading, ease of vectorisation and cache friendliness by design (as far as the public tidbits about it imply).
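            A minimal sketch of what that design enables, assuming a device, queue and queue family index already exist, and with recordDrawsForChunk() as a hypothetical placeholder for a thread's share of the draw calls: each worker thread records into its own command pool and command buffer, and only the final queue submission is serialized.

```cpp
// Each worker thread owns a VkCommandPool and records its VkCommandBuffer in
// parallel; only vkQueueSubmit happens on a single thread. Error handling,
// fences and pool cleanup are omitted for brevity.
#include <vulkan/vulkan.h>
#include <thread>
#include <vector>

struct PerThread {
    VkCommandPool   pool   = VK_NULL_HANDLE;
    VkCommandBuffer cmdBuf = VK_NULL_HANDLE;
};

void recordAndSubmitInParallel(VkDevice device, VkQueue queue,
                               uint32_t queueFamilyIndex, unsigned threadCount) {
    std::vector<PerThread> perThread(threadCount);
    std::vector<std::thread> workers;

    for (unsigned i = 0; i < threadCount; ++i) {
        workers.emplace_back([&, i] {
            PerThread& t = perThread[i];

            // Command pools are externally synchronized, so one pool per
            // thread means no locking while recording.
            VkCommandPoolCreateInfo poolInfo{};
            poolInfo.sType            = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO;
            poolInfo.queueFamilyIndex = queueFamilyIndex;
            vkCreateCommandPool(device, &poolInfo, nullptr, &t.pool);

            VkCommandBufferAllocateInfo allocInfo{};
            allocInfo.sType              = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
            allocInfo.commandPool        = t.pool;
            allocInfo.level              = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
            allocInfo.commandBufferCount = 1;
            vkAllocateCommandBuffers(device, &allocInfo, &t.cmdBuf);

            VkCommandBufferBeginInfo beginInfo{};
            beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
            vkBeginCommandBuffer(t.cmdBuf, &beginInfo);
            // recordDrawsForChunk(t.cmdBuf, i);  // hypothetical: this thread's draws
            vkEndCommandBuffer(t.cmdBuf);
        });
    }
    for (auto& w : workers) w.join();

    // Submission touches the queue, so it stays on one thread.
    std::vector<VkCommandBuffer> cmdBufs;
    for (auto& t : perThread) cmdBufs.push_back(t.cmdBuf);

    VkSubmitInfo submit{};
    submit.sType              = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submit.commandBufferCount = static_cast<uint32_t>(cmdBufs.size());
    submit.pCommandBuffers    = cmdBufs.data();
    vkQueueSubmit(queue, 1, &submit, VK_NULL_HANDLE);
}
```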





            • #7
              Originally posted by M@GOid View Post
              I don't think the APIs are the only ones to blame. Some engines do better than others with multicore CPUs. CryEngine, for example, is very good at taxing all the cores of my FX8350, while other engines only touch 2 or 4 cores.

              Then you read a comment from Matt Wagner, the head behind the simulators Lock-On, Flaming Cliffs and the most recent DCS World, a game that charges you between $30 and $60 for each DLC (the base game is free and comes with only 2 or 3 planes), saying that their patched-to-death engine (it has been around since the early 2000s) will not get multithreading because it would only bring marginal performance gains... That's like saying a modern RTS game gains nothing from multithreading support. He'd be better off admitting that his company doesn't have the money to rewrite their engine to take advantage of modern CPUs.
              There are workarounds for the lack of multithreading support in OpenGL and DirectX (there are ALWAYS workarounds), but that's just what they are: workarounds, and very difficult ones to pull off at that. The way I picture it, you can force the code onto different threads through low-level engine code, e.g. running several different chunks of the graphics code on different threads if those threads are available. But this means you're severely limited in how many threads you can support, because you have to chop your graphics code into as many chunks as the number of cores you want to support, and that's the easy part; making those chunks work together under the same application is the real headache. It's like having a different program for each chunk of code and having to bind them all together in your core code (the non-graphics code, that is). I can't even begin to imagine how hard that is. It's possible and it'll do the job, but it's an extremely limited approach. It's only one of a couple of approaches, but I doubt there are any easier ones.

              Then again, you can never be sure that the graphics code is actually what's running on all those cores. Game engines aren't only graphics code; they have physics calculations and whatnot that can be done in your everyday programming languages like C++, where splitting the work across cores can be relatively painless. It could very well be that Crytek has optimized CryEngine's code for multithreading to the best of their ability in every part except the graphics code. I'm not pretending to know, but that's one possibility. (For example, in the relatively simple gnome Vulkan demo from Imagination, you could see one CPU on a low load, one CPU on a high load and two CPUs untouched; that could simply have been all the OpenGL code running on core 2 and everything else on core 1.)
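              A rough sketch of that kind of split, with updatePhysicsChunk() and renderFrameGL() as hypothetical stand-ins for real engine code: all OpenGL calls stay on one thread (a GL context can only be current on one thread at a time), while CPU-only work such as physics is fanned out to the remaining cores.

```cpp
// Keep the GL context on the calling thread; spread simulation work across
// worker tasks and join them before the next frame starts.
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-ins for engine code.
void updatePhysicsChunk(unsigned chunk, unsigned chunkCount, float dt) { /* ... */ }
void renderFrameGL() { /* all OpenGL calls happen here */ }

void gameFrame(float dt) {
    unsigned hc      = std::thread::hardware_concurrency();
    unsigned workers = hc > 1 ? hc - 1 : 1;   // leave one core for the GL thread

    std::vector<std::future<void>> jobs;
    for (unsigned i = 0; i < workers; ++i)
        jobs.emplace_back(std::async(std::launch::async,
                                     [=] { updatePhysicsChunk(i, workers, dt); }));

    renderFrameGL();               // graphics work, single-threaded as before

    for (auto& j : jobs) j.get();  // wait for the simulation chunks to finish
}
```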



              • #8
                Originally posted by rabcor View Post

                There are workarounds for the lack of multithreading support in OpenGL and DirectX (there are ALWAYS workarounds), but that's just what they are: workarounds, and very difficult ones to pull off at that. The way I picture it, you can force the code onto different threads through low-level engine code, e.g. running several different chunks of the graphics code on different threads if those threads are available. But this means you're severely limited in how many threads you can support, because you have to chop your graphics code into as many chunks as the number of cores you want to support, and that's the easy part; making those chunks work together under the same application is the real headache. It's like having a different program for each chunk of code and having to bind them all together in your core code (the non-graphics code, that is). I can't even begin to imagine how hard that is. It's possible and it'll do the job, but it's an extremely limited approach. It's only one of a couple of approaches, but I doubt there are any easier ones.

                Then again, you can never be sure that the graphics code is actually what's running on all those cores. Game engines aren't only graphics code; they have physics calculations and whatnot that can be done in your everyday programming languages like C++, where splitting the work across cores can be relatively painless. It could very well be that Crytek has optimized CryEngine's code for multithreading to the best of their ability in every part except the graphics code. I'm not pretending to know, but that's one possibility. (For example, in the relatively simple gnome Vulkan demo from Imagination, you could see one CPU on a low load, one CPU on a high load and two CPUs untouched; that could simply have been all the OpenGL code running on core 2 and everything else on core 1.)


                Yes, the hope is that Vulkan and DX12 will get game engines to use all the power available in today's hardware with minimal barriers.

                CryEngine got a huge bump in performance from version 2 to version 3. In Crysis 1 only 2 cores are utilized, but Crysis 3, with much better graphics, is actually easier on the hardware, delivering better FPS on the same machine. CryEngine 3 got multithreading so right that it made the impossible possible: running Crysis 1 on the PS3 and Xbox 360. With simplified graphics of course, but still very faithful to the PC version.



                • #9
                  Well, there are a few things that can be done in OpenGL applications even without using multiple contexts and stuff:
                  - Using DSA (core in GL 4.5), which avoids a hell of a lot of slow glBind* calls. It's a relatively new feature, but pretty much every driver supports it by now.
                  - Using persistent mapping (GL_ARB_buffer_storage, core in GL 4.4) if available. That way, buffer uploads and downloads can be performed in a dedicated thread.
                  - Not using OpenGL for anything that doesn't require the GPU, like texture format conversion or compression.
                  - Simulating command buffers by submitting closures containing OpenGL commands to a queue so that the main thread can execute those OpenGL commands without any significant overhead.
                  - Using MultiDrawIndirect to perform as many draws per draw call as possible.

                  I'm currently experimenting with exactly that - unfortunately, rendering is delayed by two full frames (one for generating the command buffers and creating new OpenGL objects, another one for buffer data transfers) and it's certainly not easy to implement, but it does work. Don't know how well it will actually scale when rendering lots of stuff, though, since I can currently only render a GUI.
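                  Roughly what that closure-queue idea could look like (a sketch, not the code described above): worker threads push deferred GL calls into a queue as std::function objects, and only the thread that owns the context drains the queue and issues the real calls.

```cpp
// A simple "command buffer" emulation for OpenGL: closures are queued from any
// thread, then executed on the single thread that owns the GL context.
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

class GLCommandQueue {
public:
    // Callable from worker threads.
    void push(std::function<void()> cmd) {
        std::lock_guard<std::mutex> lock(mutex_);
        commands_.push(std::move(cmd));
    }

    // Called from the GL thread, e.g. once per frame.
    void execute() {
        std::queue<std::function<void()>> local;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(local, commands_);   // take everything, hold the lock briefly
        }
        while (!local.empty()) {
            local.front()();               // the deferred GL calls run here
            local.pop();
        }
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> commands_;
};

// Usage from a worker thread (the glBufferSubData call runs later, on the GL thread):
//   queue.push([=] { glBufferSubData(GL_ARRAY_BUFFER, offset, size, data); });
```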


                  Can't wait to try Vulkan, which is (probably) going to be so much easier to use in that regard.



                  • #10
                    I wonder what command buffers actually address compared to OpenGL. Correct me if I am wrong: with OpenGL you set states one by one, like the draw mode or the depth buffer, every time. With a command buffer, I assume these commands can be compiled all at once into something the GPU consumes directly and then submitted as one unit, so the individual calls don't have to be re-sent every frame?
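                    For illustration, a minimal sketch of that pattern in Vulkan, assuming the pipeline, render pass, framebuffer and vertex buffer were created during setup: the state-setting and draw commands are recorded once into a command buffer, and each frame the same pre-recorded buffer is handed to the queue again instead of re-issuing the individual calls.

```cpp
// Record once, submit every frame. Swapchain acquisition and synchronization
// are omitted to keep the sketch short.
#include <vulkan/vulkan.h>

void recordOnce(VkCommandBuffer cmdBuf, VkRenderPass renderPass,
                VkFramebuffer framebuffer, VkExtent2D extent,
                VkPipeline pipeline, VkBuffer vertexBuffer) {
    VkCommandBufferBeginInfo begin{};
    begin.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
    vkBeginCommandBuffer(cmdBuf, &begin);

    VkRenderPassBeginInfo rp{};
    rp.sType             = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO;
    rp.renderPass        = renderPass;
    rp.framebuffer       = framebuffer;
    rp.renderArea.extent = extent;
    vkCmdBeginRenderPass(cmdBuf, &rp, VK_SUBPASS_CONTENTS_INLINE);

    // What OpenGL would set as per-frame global state is baked in here once.
    vkCmdBindPipeline(cmdBuf, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
    VkDeviceSize offset = 0;
    vkCmdBindVertexBuffers(cmdBuf, 0, 1, &vertexBuffer, &offset);
    vkCmdDraw(cmdBuf, 3, 1, 0, 0);

    vkCmdEndRenderPass(cmdBuf);
    vkEndCommandBuffer(cmdBuf);
}

void drawFrame(VkQueue queue, VkCommandBuffer cmdBuf) {
    // Per frame: just resubmit the pre-recorded commands.
    VkSubmitInfo submit{};
    submit.sType              = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submit.commandBufferCount = 1;
    submit.pCommandBuffers    = &cmdBuf;
    vkQueueSubmit(queue, 1, &submit, VK_NULL_HANDLE);
}
```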

