Mesa R500 Texture Semaphore Improvements

Published on October 10, 2011
Written by Michael Larabel

Tom Stellard, the former Google Summer of Code student who worked on ATI R300 GLSL compiler improvements and a new register allocator, is now employed by AMD and has been looking into Radeon OpenCL support. However, Tom is working on other open-source Radeon code too. Recently he improved the R300g driver's instruction scheduler to make better use of the texture semaphore.

As he mentioned on his personal blog late last month, "The texture semaphore is used by instructions that need to read texture data to tell the ALU to delay execution until the desired texture data has been fetched from the texture unit. Previously in the r300g compiler, all instructions were using this semaphore, so even instructions that didn't need texture data were waiting for it to be fetched. With these improvements, we are able to prefetch texture data by placing instructions that don't depend on texture data directly after texture look ups, so they execute while the data is being fetched. This should lead to some performance improvements for certain kinds of shaders. In Lightsmark, there is one shader in particular that really benefits from this optimization, and I'm getting about a 33% speed up in overall FPS, with these new changes on my RV515."
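To make the idea in that quote concrete, here is a minimal Python sketch of a toy scheduler that issues texture look-ups first and then fills the fetch latency with ALU instructions that do not read the pending texture results. This is not the r300g compiler's code: the `Instr` type, the `schedule` function, the register names, and the rule that one semaphore wait drains every outstanding fetch are all assumptions made for illustration, and the sketch assumes each register is written only once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record, not an r300g structure."""
    name: str
    kind: str          # "tex" for a texture look-up, "alu" for arithmetic
    dst: str           # register written (assumed to be written only once)
    srcs: tuple = ()   # registers read


def schedule(instrs, max_tex_group=8):
    """Return [(name, waits_on_semaphore), ...] in issue order."""
    produced = {i.dst for i in instrs}
    remaining = list(instrs)
    done, in_flight, out = set(), set(), []

    def ready(i):
        # A source is available if it is a shader input or already issued.
        return all(s in done or s not in produced for s in i.srcs)

    def needs_wait(i):
        # Reading a texture result that is still being fetched.
        return any(s in in_flight for s in i.srcs)

    def emit(i, wait):
        out.append((i.name, wait))
        done.add(i.dst)
        remaining.remove(i)

    while remaining:
        before = len(remaining)

        # 1. Issue a group of texture look-ups that are ready right now.
        group = [i for i in remaining
                 if i.kind == "tex" and ready(i) and not needs_wait(i)]
        group = group[:max_tex_group]
        for i in group:
            emit(i, False)
        in_flight.update(i.dst for i in group)

        # 2. Fill the fetch latency with ALU work that needs no texture data.
        while True:
            free = [i for i in remaining
                    if i.kind == "alu" and ready(i) and not needs_wait(i)]
            if not free:
                break
            emit(free[0], False)

        # 3. Only now issue an instruction that must wait on the semaphore.
        blocked = [i for i in remaining if ready(i) and needs_wait(i)]
        if blocked:
            in_flight.clear()  # simplification: one wait drains all fetches
            i = blocked[0]
            emit(i, True)
            if i.kind == "tex":
                in_flight.add(i.dst)

        if len(remaining) == before:
            raise ValueError("cyclic or unsatisfiable dependencies")
    return out


program = [
    Instr("t0", "tex", "r0", ("coord",)),
    Instr("a0", "alu", "r1", ("r0",)),        # reads the texture result
    Instr("a1", "alu", "r2", ("c0", "c1")),   # independent of the texture
    Instr("a2", "alu", "r3", ("r1", "r2")),
]
result = schedule(program)
```

In this small program the independent instruction `a1` is moved directly after the look-up `t0`, so only `a0` is flagged as waiting on the semaphore. A real scheduler would also have to account for register pressure and hardware limits, which this sketch ignores.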

While part of the R300 Gallium3D driver, this work is only relevant to the ATI Radeon X1000 (R500) series. With his report of such huge performance gains in a shader-heavy OpenGL workload like Lightsmark, I couldn't help but run some benchmarks as soon as I returned from Oktoberfest.

The instruction scheduler enhancements for the texture semaphore have not yet been merged to Mesa master. For now they live in the "tex-sem" branch of Stellard's personal Mesa Git repository on FreeDesktop.org, and will hopefully be merged to mainline Mesa in the near future. This texture semaphore work also hooks into a new debugging environment variable, RADEON_TEX_GROUP, which sets the maximum number of texture look-ups to submit concurrently. The default is eight, but Tom says the best-performing number may differ depending upon the application and graphics processor.
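Since the best value may vary per application and GPU, one way to experiment is to launch the same program several times with different RADEON_TEX_GROUP settings. The Python sketch below shows one way to do that. It assumes the variable accepts a plain integer, which the article implies but does not spell out, and the helper names `tex_group_env` and `run_with_tex_group` are hypothetical.

```python
import os
import subprocess


def tex_group_env(group_size, base_env=None):
    """Return a copy of the environment with RADEON_TEX_GROUP set.

    Assumption: the variable takes a plain integer value.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["RADEON_TEX_GROUP"] = str(group_size)
    return env


def run_with_tex_group(cmd, group_size):
    """Run cmd (a list of arguments) with the given texture group size."""
    completed = subprocess.run(cmd, env=tex_group_env(group_size), check=False)
    return completed.returncode


# Example use, with a placeholder benchmark command:
#   for size in (1, 2, 4, 8):
#       run_with_tex_group(["./my-benchmark"], size)
```

Copying the environment rather than modifying `os.environ` keeps each run isolated, so a value tried for one launch does not leak into the next.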

Stellard's tex-sem branch also offers a few other improvements, such as a smarter instruction scheduler and the re-enabling of the register rename pass to enhance all compiler optimizations. It is interesting work for this open-source Gallium3D driver targeting older Radeon hardware.

For this article I compared the performance of Tom Stellard's tex-sem branch of Mesa against mainline Mesa as of 6 October 2011, using the latest Linux 3.1 kernel as of the same date. Swap buffers wait was also disabled via xorg.conf (color tiling is already enabled by default for the R500 series).

The graphics cards tested were an ATI Radeon X1800XL, ATI Radeon X1800XT, and ATI Radeon X1950PRO. Unfortunately, last month I gave the X1300PRO graphics card to Martin Graesslin (the KDE KWin maintainer), as he doesn't have any R300/400/500 class hardware and is working to debug some R300g driver issues with the KWin compositing window manager. This instruction scheduler testing is therefore limited to just three higher-end R500 GPUs.
