Why The Radeon Gallium3D Performance Is Down

Michael Larabel

Published on 23 December 2011

Combine the Phoronix Test Suite with a fast Intel CPU and almost any change in performance can be quickly located. The drop in OpenArena frame-rate with Mesa 7.12-devel was no different. The Radeon X1950PRO graphics card was again used in the Core i7 870 desktop; all of the details are hosted on OpenBenchmarking.org.

So what did the Phoronix Test Suite find? The biggest offender currently impairing OpenArena performance on the RV570 graphics card with the Mesa 7.12-devel Git head is commit ef64da8f013691c66744064769db379e57ef95de, a.k.a. "winsys/radeon: don't use the new GEM_WAIT ioctl for now". This change by Marek Olšák to the Radeon winsys for Gallium3D simply disables the GEM_WAIT ioctl code-path, even when the running Linux kernel supports this particular ioctl.
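For those curious how this sort of automated regression hunt works in principle, it boils down to a binary search over the commit history. Below is a minimal Python sketch of that idea, not the Phoronix Test Suite's actual code. The function name `first_bad_commit`, the placeholder commit names, and the frame-rate numbers are all hypothetical and only there for illustration; the sketch also assumes that once the regression lands, every later commit stays slow.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the first one is good,
    the last one is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical history and made-up frame rates, purely for illustration.
history = ["c0", "c1", "c2", "c3", "c4", "c5"]
fps = {"c0": 100.0, "c1": 101.0, "c2": 99.0, "c3": 80.0, "c4": 81.0, "c5": 79.0}
THRESHOLD = 90.0  # anything slower than this counts as regressed

culprit = first_bad_commit(history, lambda c: fps[c] < THRESHOLD)
print(culprit)
```

In a real run the `is_bad` callback would build that commit and run the benchmark, which is why a fast CPU helps: each step of the search costs a full compile plus a test run, but only about log2(N) steps are needed for N commits.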

That this commit is the culprit is not terribly surprising, since the GEM_WAIT ioctl is known to increase performance (particularly for CPU-bottlenecked graphics workloads), as covered in an earlier Phoronix article. The DRM_RADEON_GEM_WAIT ioctl was only introduced a few months back, so unless you are using a very new Linux kernel snapshot, chances are you would not have noticed the boost in performance in the first place. Here is a partial explanation of this work from when Marek was originally pushing the kernel changes:

Sometimes we want to know whether a buffer is busy and wait for it (bo_wait). However, sometimes it would be more useful to be able to query whether a buffer is busy and being either read or written, and wait until it's stopped being either read or written. The point of this is to be able to avoid unnecessary waiting, e.g. if a GPU has written something to a buffer and is now reading that buffer, and a CPU wants to map that buffer for read, it needs to only wait for the last write. If there were no write, there wouldn't be any waiting needed.

This, of course, requires user space drivers to send read/write flags with each relocation (like we have read/write domains in radeon, so we can actually use those for something useful now).
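To make the reasoning in that explanation a bit more concrete, here is a toy Python model of the wait decision it describes. It is only a sketch under simplifying assumptions and has nothing to do with the actual kernel or Mesa code: the `Buffer` class, the fence numbers, and the `fence_to_wait_for` helper are all hypothetical names, and GPU work is assumed to complete in fence order.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    """Toy buffer object that tracks GPU reads and writes separately."""
    last_gpu_read: int = 0   # fence number of the latest GPU read, 0 = none
    last_gpu_write: int = 0  # fence number of the latest GPU write, 0 = none


def fence_to_wait_for(buf: Buffer, cpu_wants_write: bool, completed: int) -> int:
    """Return the fence the CPU must wait on before mapping, or 0 for none.

    completed is the highest fence number the GPU has already finished.
    """
    if cpu_wants_write:
        # A CPU write must not overlap with any GPU access to the buffer.
        needed = max(buf.last_gpu_read, buf.last_gpu_write)
    else:
        # A CPU read only conflicts with GPU writes, so GPU reads are ignored.
        needed = buf.last_gpu_write
    return needed if needed > completed else 0


# The GPU wrote the buffer (fence 5) and is still reading it (fence 8);
# the GPU has finished everything up to fence 6.
buf = Buffer(last_gpu_read=8, last_gpu_write=5)
print(fence_to_wait_for(buf, cpu_wants_write=False, completed=6))
print(fence_to_wait_for(buf, cpu_wants_write=True, completed=6))
```

With a single "is the buffer busy?" check, both maps in this example would have to wait for fence 8. Tracking reads and writes separately lets the read-only map go ahead without stalling, which is where the gain for CPU-bottlenecked workloads would come from.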

However, the DRM_RADEON_GEM_WAIT ioctl work was only introduced to Mesa in August, after the 7.11 release. There is another problem at hand.
