Marek Continues Improving Radeon Performance

Written by Michael Larabel in Radeon on 6 November 2012 at 06:31 AM EST.
Marek Olšák has landed another improvement to the Radeon Gallium3D R600 driver in Mesa that can improve OpenGL performance in certain situations for this open-source AMD Linux driver while also reducing memory usage.

Marek Olšák, the student developer from Europe who has independently made significant contributions to Mesa/Gallium3D and particularly the open-source AMD Radeon graphics drivers, is continuing his work. Last week he worked out two more performance patches to narrow the gap between the open-source driver and the proprietary AMD Catalyst driver, following some disappointing performance results in a Phoronix article. He also enabled 2D color tiling last week for the more recent Radeon graphics hardware on this open-source driver, another performance win.

Pushed to Mesa's mainline Git repository last night was a new patch by Marek that adds in-place depth buffer decompression and texturing with the depth buffer's native tiling. His patch explains:
"The decompression is done in-place and only the compressed tiles are decompressed. Note: R6xx-R7xx can do that only with Z16 and Z32F.

"The texture unit is programmed to use non-displayable tiling and depth ordering of samples, so that it can fetch the texture in the native DB format.

"The latest version of the libdrm surface allocator is required for stencil texturing to work. The old one didn't create the mipmap tree correctly. We need a separate mipmap tree for stencil, because the stencil mipmap offsets are not really depth offsets/4.

"There are still some known bugs, but this should save some memory and it also improves performance a little bit in Lightsmark (especially with low resolutions; tested with Radeon HD 5000)."

Saving memory while also improving performance a bit is certainly much appreciated.

Radeon support in Mesa now requires libdrm 2.4.40, released yesterday, which provides the stencil mipmap allocator needed for combined depth-stencil buffers.
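
The patch notes say the stencil mipmap offsets are not simply the depth offsets divided by four. Here is a minimal Python sketch of why that can happen, assuming a simplified linear layout in which every mip level is padded to a hypothetical 256-byte alignment. The real libdrm surface allocator follows hardware tiling rules that this does not model, and `align`, `mip_offsets`, and the alignment value are illustrative only, not taken from the driver.

```python
def align(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def mip_offsets(width, height, bytes_per_pixel, level_alignment=256):
    """Byte offset of each mip level in a simplified, per-level-aligned layout.

    level_alignment is a made-up value chosen for illustration.
    """
    offsets = []
    offset = 0
    while True:
        offsets.append(offset)
        offset += align(width * height * bytes_per_pixel, level_alignment)
        if width == 1 and height == 1:
            break
        width = max(1, width // 2)
        height = max(1, height // 2)
    return offsets


# A 16x16 surface with 32-bit depth (4 bytes/pixel) and 8-bit stencil (1 byte/pixel).
depth_offsets = mip_offsets(16, 16, bytes_per_pixel=4)
stencil_offsets = mip_offsets(16, 16, bytes_per_pixel=1)

# The shortcut the patch notes warn against: reuse the depth offsets divided by 4.
naive_stencil_offsets = [o // 4 for o in depth_offsets]

print("depth:        ", depth_offsets)
print("stencil:      ", stencil_offsets)
print("depth offs / 4:", naive_stencil_offsets)
```

In this toy layout the two lists agree for the first levels and diverge once the small mip levels are padded up to the alignment, which is the kind of mismatch a separate stencil mipmap tree avoids.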

Some might also be interested in comments Marek made recently in the forums, where he says "we're fighting a battle we can't win" when it comes to competing with the Catalyst driver on performance:
"I expected worse results after seeing the bug report about Unigine Heaven. Anyway, we don't have many options at the moment (I see only one: reverting the commit). The mechanism that decides where buffers are placed (VRAM or GTT) and which buffers are moved when we start to run out of memory must be overhauled. This is a bigger project and I don't have time for it right now. The kernel DRM interface might need some changes. We also need good tools to detect bottlenecks and a good GPU resource monitor. Right now if you run out of GPU memory, there's no easy way to know and definitely no way to know what is eating the memory. We're mostly blind right now.

"However, we're fighting a battle we can't win. S3TC textures need 4x to 8x less memory and would help a lot with this problem. Any driver with S3TC support has a great advantage over a driver without one.

"We could also cheat by using the BC7 format for plain RGBA8 textures. That would be a win if we implemented the BC7 encoding on the GPU."
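
The 4x to 8x figure follows from the block sizes of these formats when compared against 32-bit RGBA8. Here is a rough Python sketch of the arithmetic, assuming a single 1024x1024 texture with no mipmaps. The helper names are hypothetical; the block sizes are the standard ones: 8 bytes per 4x4 block for DXT1, and 16 bytes per 4x4 block for DXT5 and BC7.

```python
def uncompressed_bytes(width, height, bytes_per_pixel=4):
    """Size of a plain texture; RGBA8 uses 4 bytes per pixel."""
    return width * height * bytes_per_pixel


def block_compressed_bytes(width, height, bytes_per_block):
    """Size of a texture stored as 4x4 pixel blocks (S3TC and BC7 layouts)."""
    blocks_x = (width + 3) // 4  # round up to whole blocks
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * bytes_per_block


width = height = 1024  # example size, chosen only for illustration

rgba8 = uncompressed_bytes(width, height)
dxt1 = block_compressed_bytes(width, height, bytes_per_block=8)
dxt5 = block_compressed_bytes(width, height, bytes_per_block=16)
bc7 = block_compressed_bytes(width, height, bytes_per_block=16)

for name, size in (("RGBA8", rgba8), ("DXT1", dxt1), ("DXT5", dxt5), ("BC7", bc7)):
    print(f"{name}: {size // 1024} KiB, {rgba8 // size}x smaller than RGBA8")
```

Under these assumptions DXT1 comes out 8x smaller and DXT5 4x smaller than RGBA8, and storing RGBA8 data as BC7 would give the same 4x saving as DXT5.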