Radeon Gallium3D R600g Color Tiling Performance
-
Originally posted by pingufunkybeat: In your case, it is about 4ms faster at rendering a single texture.
Optimising 4ms away is really hard work, especially if it consists of 100 different milliseconds collected across different parts of the driver. That's what my armchair response was about.
The driver's bottleneck is the CPU, since that is where it runs as a program, no? But low CPU usage pretty much eliminates that possibility in this case, so the GPU is the bottleneck, and something must be wrong there. Such a simple case indicates that something is being done wrong: a speed-up feature not being used, or extra/different use of the GPU. And it doesn't seem like many small issues, more like a couple of bigger ones, as mentioned. I doubt it took AMD 15 years to optimize rendering a single texture. One possibility (may be invalid): could it have to do with texture compression? Test it with a simple gradient instead and see. xD
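The gradient test suggested above is easy to sketch. Here is a minimal Python helper (the function name and sizes are my own invention, not from the thread) that builds an uncompressed RGBA gradient you could upload as raw texture data, to rule out texture-compression effects when comparing drivers:

```python
def gradient_rgba(width, height):
    """Build a horizontal grayscale gradient as raw RGBA bytes.

    Uploading this uncompressed buffer directly (e.g. via glTexImage2D)
    keeps any texture-compression path out of the comparison.
    """
    pixels = bytearray()
    for y in range(height):
        for x in range(width):
            v = x * 255 // (width - 1)  # 0 at the left edge, 255 at the right
            pixels.extend((v, v, v, 255))
    return bytes(pixels)
```

This is just a sketch of the test data, not of the rendering loop itself; the interesting part is timing the draw, which the thread's benchmark already does.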
-
Originally posted by Rigaldo: Please elaborate on what optimizations should be done when rendering a single image? In such a simple process, pretty much the same "calls" to the GPU should be made, no?
The driver's bottleneck is the CPU, since that is where it runs as a program, no? But low CPU usage pretty much eliminates that possibility in this case.
Even if your processor is mostly idle, a simple cache miss might cause a considerable delay while your GPU is waiting for the next instruction.
But again, I'm not a GPU developer. I just don't believe that using less than 100% of CPU all the time means that there are no bottlenecks in the driver. A 1ms delay is a 1ms delay, even if it only happens occasionally.
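The point about occasional stalls mattering despite an idle-looking CPU can be made concrete with a back-of-the-envelope model (invented numbers, not measurements from this thread):

```python
def effective_fps(render_ms, stall_ms=0.0):
    """Frames per second when every frame costs `render_ms` of GPU work
    plus `stall_ms` spent waiting on the driver.

    This is a toy model: real stalls are intermittent, which makes the
    average cost smaller but no less real.
    """
    return 1000.0 / (render_ms + stall_ms)

# A 4 ms per-frame stall on a 10 ms workload costs roughly 29% of the
# frame rate, even though the CPU doing the stalling may look almost idle.
print(effective_fps(10.0))        # 100.0
print(effective_fps(10.0, 4.0))   # ~71.43
```

Note how this also matches the "about 4ms faster at rendering a single texture" figure from earlier in the thread: a few milliseconds per frame is a large fraction of a 60 fps frame budget.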
-
Originally posted by pingufunkybeat: Like I said, this is for driver developers to answer; I lack the knowledge. Marek and Alex have already written that all hardware functionality is used (only HiZ is not on by default). If I remember correctly, Jerome Glisse did profile the driver and couldn't find one single bottleneck, but many small ones. I can't find a link at the moment; perhaps somebody is better at googling.
Only if they operate completely asynchronously. If the GPU ever has to wait for the driver before continuing, then no.
Even if your processor is mostly idle, a simple cache miss might cause a considerable delay while your GPU is waiting for the next instruction.
But again, I'm not a GPU developer. I just don't believe that using less than 100% of CPU all the time means that there are no bottlenecks in the driver. A 1ms delay is a 1ms delay, even if it only happens occasionally.
-
Originally posted by tmikov: Fair enough. But what would be a good synthetic stress test?
Also, do you have an idea why the blob is faster? Could it be memory clocks, power management, etc?
There is not a single thing that explains the gap between the open source driver and the closed source driver. There is no secret way of doing things; we have tools to capture the fglrx command stream, and there is nothing fundamentally different. Proper power management support, with use of the on-chip governor to manage clocks, will probably improve performance a bit; better buffer placement heuristics, a better shader compiler, less CPU overhead, less CP stalling... many little things like those add up.
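The "many little things add up" point is easy to illustrate with a toy calculation. The categories below echo the list above, but every number is purely invented for illustration:

```python
# Hypothetical per-frame overheads in milliseconds (numbers invented for
# illustration only; no single one looks like "the" bottleneck).
overheads_ms = {
    "command-stream submission": 0.4,
    "buffer placement decisions": 0.3,
    "less efficient shader code": 0.6,
    "CP stalls": 0.5,
    "suboptimal memory clocks": 0.7,
}

def fps(frame_ms):
    """Convert a per-frame cost in milliseconds to frames per second."""
    return 1000.0 / frame_ms

base_ms = 10.0  # assumed baseline frame time for the workload
total_ms = base_ms + sum(overheads_ms.values())
print(f"{fps(base_ms):.1f} fps -> {fps(total_ms):.1f} fps")  # 100.0 fps -> 80.0 fps
```

Five overheads of well under a millisecond each, none worth calling a bottleneck on its own, still cost 20% of the frame rate in this model.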
-
This makes me wonder... It's just a wild guess, but are you making sure that the GPU is rendering only your texture, and nothing else? If the test runs like glxgears, the difference in performance could very well be due to the GPU having to render all the windows in the background, as well as the test object. And even if it runs fullscreen, is there any guarantee that the GPU is not rendering something in the background or offscreen before drawing the test texture on top?