RADV Exploring "A Driver On The GPU" In Moving More Vulkan Tasks To The GPU
Originally posted by jabl View Post
I'm not that familiar with AMD GPUs, so I'm just speculating here, but I'd say it's not a "hardware scheduler" in the sense that the scheduler would be implemented in hardcoded ASIC logic. There is probably a tiny management CPU (say, an ARM core) on the GPU that handles all kinds of management tasks, including running the "hardware scheduler". The scheduler code is thus part of the GPU firmware.
Further, I'd speculate that lower-end GPUs cut out execution units, not the tiny management core, since that one is needed anyway for all kinds of stuff. That management core is probably so tiny that it's not worth the effort to have several different management cores depending on how beefy the GPU is. So whether the hardware scheduler is missing or not is probably more of a generation thing than something dependent on the beefiness of the GPU.
agd5f Thank you for such a detailed answer! So I guess when the hardware scheduler is missing (like in APUs, which cut it to save die space), it falls back to the scheduler in the kernel driver?
Originally posted by JacekJagosz View Post
How does this "Driver on the GPU" relate to Hardware Accelerated GPU Scheduling on Windows? There has also been a discussion on the Windows side that many newer AMD GPUs have a hardware scheduler, which Nvidia dropped in its newer GPUs, and that this is why Nvidia has higher CPU overhead than AMD in DX12. Hardware Unboxed made a video about it, and I have always wondered whether this hardware scheduler in AMD GPUs gets used at all.
Originally posted by jorgepl View Post
I would love to know more about this topic, and also the pros and cons of the implicit vs explicit sync model. AFAIK it's not a Linux sync model but a Wayland sync model, but that's as far as I could read on the internet. I can't find more info. Do you know where I could read more about this?
So if you think about how stuff works in modern OpenGL: you create buffers containing things like vertex coordinates, vertex indices, normals, textures and whatnot that make up a scene. Then you have shaders, small programs that run per-vertex and per-fragment (~pixel), which the driver compiles. You upload all these buffers and compiled shaders to the GPU, and tell the GPU to start rendering.
For this kind of simple usage implicit sync works fine. Implicit here means that the order of events is whatever order the host CPU submitted them to the GPU. This is easier for the programmer, as there is no need to do any, well, explicit synchronization, and it works fine as long as you have a single CPU core interacting with the GPU and the flow in the rendering pipeline is FIFO (that is, the host prepares work and uploads it to the GPU, which renders it and outputs to the screen, with no need for any kind of back-and-forth).
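To make the implicit model concrete, here is a minimal sketch (deliberately old-style GL 1.1 through GLFW to keep it short; the setup is illustrative only, not from this thread): there is no synchronization object anywhere, the driver executes the commands in the order they were issued, and the readback at the end quietly waits for the draw to finish.
Code:
/* Implicit sync: commands run in submission order, no fences or semaphores. */
#include <GLFW/glfw3.h>
#include <stdio.h>

int main(void) {
    if (!glfwInit()) return 1;
    GLFWwindow *win = glfwCreateWindow(256, 256, "implicit sync", NULL, NULL);
    glfwMakeContextCurrent(win);

    float verts[] = { -0.5f, -0.5f,  0.5f, -0.5f,  0.0f, 0.5f };

    glClear(GL_COLOR_BUFFER_BIT);

    /* Queue a draw; it is not necessarily finished when this call returns. */
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(2, GL_FLOAT, 0, verts);
    glDrawArrays(GL_TRIANGLES, 0, 3);

    /* Read back one pixel: the driver implicitly waits for the draw first. */
    unsigned char px[3];
    glReadPixels(128, 128, 1, 1, GL_RGB, GL_UNSIGNED_BYTE, px);
    printf("centre pixel: %u %u %u\n", px[0], px[1], px[2]);

    glfwTerminate();
    return 0;
}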
But what if you have, say, multiple CPU cores that you want to use for creating these buffers and sending commands to the GPU (this is a huge pain or outright impossible in OpenGL, but it's exactly the modern world Vulkan was designed for)? Or you have buffers that you want the GPU to do some work on, then you want to download some result of this work to the host (while doing other GPU interaction while waiting for this work to complete), do some work with it and, based on that, submit more work to the GPU? Or you have multiple GPUs? This is where the implicit sync method starts to break down, because the point of having many CPU cores interacting with the GPU is that you want them to run along independently as much as possible and not be tightly synced with each other.
So this is where explicit sync comes in, which is a model where everything by default runs independently, and where you want to enforce some ordering (say, tell the GPU to not start rendering a frame before all the buffers and other needed inputs are ready) you have to use explicit synchronization constructs.
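To give a rough idea of what those explicit constructs look like in Vulkan, here is a sketch of a single queue submission (device, swapchain and command buffer creation are omitted, and the handle names are made up for illustration): nothing is ordered unless the submission says so, and the CPU only blocks where it explicitly asks to.
Code:
/* Explicit sync sketch: ordering and CPU waits are all spelled out. */
#include <vulkan/vulkan.h>
#include <stdint.h>

void submit_with_explicit_sync(VkDevice device, VkQueue queue,
                               VkCommandBuffer cmd,
                               VkSemaphore imageReady,  /* signalled by the acquire step */
                               VkSemaphore renderDone,  /* waited on by the present step */
                               VkFence inFlight)        /* tells the CPU when cmd is done */
{
    VkPipelineStageFlags waitStage = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;

    VkSubmitInfo submit = {
        .sType                = VK_STRUCTURE_TYPE_SUBMIT_INFO,
        .waitSemaphoreCount   = 1,
        .pWaitSemaphores      = &imageReady,  /* GPU waits here before running cmd */
        .pWaitDstStageMask    = &waitStage,
        .commandBufferCount   = 1,
        .pCommandBuffers      = &cmd,
        .signalSemaphoreCount = 1,
        .pSignalSemaphores    = &renderDone,  /* GPU signals this when cmd finishes */
    };
    vkQueueSubmit(queue, 1, &submit, inFlight);

    /* The CPU only blocks where it explicitly chooses to. */
    vkWaitForFences(device, 1, &inFlight, VK_TRUE, UINT64_MAX);
    vkResetFences(device, 1, &inFlight);
}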
In Mesa-land there seems to be a consensus that explicit sync is the future (most other platforms have already moved to such a model); the debate is about:
- How to move to an explicit sync model without breaking all the existing code that assumes implicit sync.
- What can be done in userspace, and what must be done in kernel space.
Originally posted by jorgepl View Post
I would love to know more about this topic, and also the pros and cons of the implicit vs explicit sync model. AFAIK it's not a Linux sync model but a Wayland sync model, but that's as far as I could read on the internet. I can't find more info. Do you know where I could read more about this?
I don't think it's really a Wayland issue, per se.
More like that's how Linux kernel DRM graphics work, because that's how OpenGL and the entire existing open source graphics stack work. So Wayland got built around that too, because it was designed around OpenGL and was already a large change to get everyone to adopt - see how that still hasn't happened yet.
Meanwhile Vulkan has come along with the new explicit model, and now everyone wants to switch over to that, but it's a big task to get everything switched over. If you've got centralized development, like, say, at Microsoft or Apple, you can have everyone work to get everything ported over all at once. But in OSS it's like herding cats to get something like that accomplished. Again, see Wayland adoption.
Originally posted by shmerl View Post
But I mean, maybe some sort of compilation step could fit that kind of vectorized architecture? I'm just guessing; I don't expect a whole compiler to run there, even if it's Turing complete.
Originally posted by Venemo View Post
Most current applications generate a command buffer on the CPU, and then ask the GPU to execute that. A command buffer is basically a list of draws, copies, dispatches etc. which is executed by the GPU's command processor.
This new feature Bas is working on makes it possible for applications to generate a command buffer on the GPU, and then execute it on the GPU without CPU intervention.
The main difficulty here is that these command buffers are more difficult to debug and analyze.
I see, thanks for clarifying.
Originally posted by Venemo View Post
It's very different.
Last edited by shmerl; 26 April 2022, 02:28 AM.
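To picture the difference Venemo describes, here is a sketch of what a CPU-recorded command buffer looks like, using standard indirect draws as the closest existing stepping stone: a compute pass on the GPU writes the draw parameters, and the draw then reads them straight from GPU memory with no CPU round trip. This is plain Vulkan, not the actual RADV "driver on the GPU" mechanism, and the pipeline/buffer names are made up (render pass or dynamic rendering setup is omitted for brevity).
Code:
/* CPU records the command buffer once; the GPU later fills in the draw args. */
#include <vulkan/vulkan.h>

void record_indirect_draw(VkCommandBuffer cmd,
                          VkPipeline argWriterPipeline, /* compute: writes draw args */
                          VkPipeline gfxPipeline,
                          VkBuffer argsBuf)             /* holds one VkDrawIndirectCommand */
{
    VkCommandBufferBeginInfo begin = {
        .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
    };
    vkBeginCommandBuffer(cmd, &begin);

    /* 1. GPU-side pass that fills argsBuf (vertexCount, instanceCount, ...). */
    vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, argWriterPipeline);
    vkCmdDispatch(cmd, 1, 1, 1);

    /* 2. Make the compute write visible to the indirect-draw stage. */
    VkMemoryBarrier barrier = {
        .sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
        .srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT,
        .dstAccessMask = VK_ACCESS_INDIRECT_COMMAND_READ_BIT,
    };
    vkCmdPipelineBarrier(cmd,
                         VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
                         VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT,
                         0, 1, &barrier, 0, NULL, 0, NULL);

    /* 3. (Inside a render pass in a real program.) The draw's parameters are
       read from GPU memory, not supplied by the CPU. */
    vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, gfxPipeline);
    vkCmdDrawIndirect(cmd, argsBuf, 0, 1, sizeof(VkDrawIndirectCommand));

    vkEndCommandBuffer(cmd);
}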