This week's ATI Gallium3D benchmarks compared the performance of the Radeon HD 4870 and Radeon HD 5770 graphics cards on this next-generation driver architecture against the classic open-source Mesa driver and AMD's high-performance proprietary Catalyst driver, and the results were mostly what one would expect. The Gallium3D driver was faster than the classic Mesa driver in most tests, but both drivers seriously lagged behind the proprietary driver. This is the case even on older-generation ATI Radeon graphics cards. That has led many to effectively ask, "what's keeping the open-source drivers from performing like the proprietary driver?" It all comes down to low-level optimizations, as is discussed in this Phoronix Forums thread. There are very large development teams for the different Catalyst driver components within AMD and much of this work is shared across all platforms, but on the open-source Linux side there are very few paid full-time developers and just a number of part-time community developers to cover the entire driver stack.
As discussed by these developers in the aforementioned thread and elsewhere within our forums, the large development teams at AMD (as well as NVIDIA) responsible for the proprietary drivers have nearly limitless resources in comparison to the X.Org / Mesa developers. The proprietary driver architectures are more advanced than the current open-source drivers (even those based on Gallium3D), with a better memory manager, multi-threaded support, and many other optimizations. It helps that these developers have direct contact with the hardware design engineers and are never held up waiting for documentation to be sanitized, etc. While there is some "low hanging fruit" for optimizations within the open-source drivers, it is also a matter of finding the most efficient and effective optimizations to tackle. As said by AMD's John Bridgman, "The first task would be to find the bottlenecks. With a graphics driver that is usually a pretty significant task on its own."
There is a module/extension I am working on for the Phoronix Test Suite stack, as part of the Iveland development efforts, that may help open-source GPU driver developers and other open-source developers (and PTS commercial clients) spot areas to focus upon for performance optimizations within graphics drivers and other sub-systems. It's codenamed Karsk and effectively combines the Phoronix Test Suite, a system performance profiler, and OpenBenchmarking.org to create one powerful combination. Karsk takes advantage of the automated benchmarking capabilities (for quantitative tests as well as qualitative tests like game visuals) and the plethora of available test profiles for different sub-systems. It combines them with a system performance profiler (such as Sysprof) that times the different system function calls and other tasks, and pipes that data back into the Phoronix Test Suite via its module framework. Lastly, it leverages OpenBenchmarking.org as a collaborative testing platform (or alternatively Phoromatic) to effectively crowd-source this effort to the community. With OpenBenchmarking.org/Phoromatic you can then gather results from many different systems and hardware combinations to rule out any system-specific slowdowns, so that any performance optimization efforts will target the largest number of users/customers.
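The idea of ruling out system-specific slowdowns can be sketched in a few lines of Python. Karsk is not public, so this is not its implementation: the function name `common_hotspots`, the report layout (system name mapped to each function's percent of profiler samples), and both thresholds are assumptions made purely for illustration.

```python
def common_hotspots(reports, threshold=10.0, min_share=0.75):
    """Return functions that are hot on most of the reporting systems.

    reports maps a system name to {function name: percent of samples}.
    A function counts as "hot" on a system when it reaches `threshold`
    percent there. Only functions that are hot on at least `min_share`
    of all systems are returned, so a slowdown seen on a single machine
    is filtered out. Both cut-offs are hypothetical defaults.
    """
    hot_counts = {}
    for profile in reports.values():
        for name, percent in profile.items():
            if percent >= threshold:
                hot_counts[name] = hot_counts.get(name, 0) + 1

    needed = min_share * len(reports)
    return sorted(name for name, count in hot_counts.items() if count >= needed)
```

Given reports from four systems where one function is hot everywhere and another is hot on only one machine, only the first would be returned as a candidate worth optimizing for the widest audience.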
The name Karsk originates in Norway and means "quick" -- the speed at which you will be able to work on performance optimizations -- while more popularly it's known as a coffee moonshine within their unique culture.
In the case of open-source GPU drivers, with the Karsk module you could run a variety of test profiles for different OpenGL applications and games (there are well over a dozen in the Phoronix Test Suite right now) to collect and generalize as much data as possible. The Phoronix Test Suite module starts and stops the system profiler (Sysprof) so that the monitoring process runs for the length of the test, then sends the traced kernel-space and user-space data back to the Phoronix Test Suite, which appends the information to the result file both on an individual test basis and averaged across all tests executed. This is faintly similar to how the Phoronix Test Suite system monitor module can append CPU/GPU/SWAP/RAM usage, CPU/GPU/system temperatures, and other metrics to Phoronix Test Suite results automatically. As this data is sent back into the Phoronix Test Suite, it becomes possible to easily compare it against other test runs on that system or other systems and to merge it with other data sets, and it is compatible with Phoromatic / OpenBenchmarking.org for results sharing and collaborative testing. With the traced data going back into the Phoronix Test Suite result file, the data is also bound to the actual performance result, so the Phoronix Test Suite can even tell you automatically whether an attempted performance optimization was successful, and whether it had adverse effects in any other tests.
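As a rough illustration of that flow, the Python sketch below starts a profiler process for the duration of a test and averages per-test profiles across a run. It is not Karsk's code, which has not been published. The names `profiler_running`, `run_profiled_test`, and `average_profiles` are hypothetical, the profiler command is whatever the caller supplies (no real Sysprof invocation is assumed), and the profile layout (function name mapped to percent of samples) is an assumption.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def profiler_running(profiler_cmd):
    """Start a profiler process and stop it when the block exits."""
    proc = subprocess.Popen(profiler_cmd)
    try:
        yield proc
    finally:
        proc.terminate()  # stop profiling as soon as the test ends
        proc.wait()


def run_profiled_test(test_cmd, profiler_cmd):
    """Run one test command with the profiler active for its whole length."""
    with profiler_running(profiler_cmd):
        subprocess.run(test_cmd, check=True)


def average_profiles(per_test):
    """Average each function's share of time across all tests.

    per_test maps a test name to {function name: percent of samples}.
    A function that is absent from a test counts as 0 for that test.
    """
    functions = {name for profile in per_test.values() for name in profile}
    count = len(per_test)
    return {
        name: sum(profile.get(name, 0.0) for profile in per_test.values()) / count
        for name in sorted(functions)
    }
```

Keeping the per-test profiles alongside the averaged view mirrors the two levels described above: a hotspot that only appears in one game stays visible, while the average shows where time goes across the whole run.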
Once the developers have used this information and think they have an optimization ready, the Phoronix Test Suite and OpenBenchmarking.org can be used again to confirm the change before moving on to the next area of improvement. Compared to how profiling is generally done at this point to "find the bottlenecks", the Phoronix Test Suite with Karsk vastly simplifies and expedites the process. Thanks to the automated processes, this also lowers the barrier for users interested in helping software projects make performance improvements. While subsystem drivers are what's being talked about in these examples, there's nothing stopping anyone from using Karsk elsewhere on the operating system to profile and optimize more user-space areas, like a compositing window manager or applications having explicit performance requirements such as high frequency trading platforms, especially with the OpenBenchmarking.org package management system supporting public and private third-party test profiles.
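The confirmation step, checking whether an optimization helped and whether it hurt any other test, amounts to comparing two result sets. The Python sketch below shows one simple way to do that. It is an illustration under stated assumptions rather than how the Phoronix Test Suite does it: the function name `compare_runs`, the 2% tolerance, and the "higher is better" result layout (for example, frames per second per test) are all hypothetical.

```python
def compare_runs(before, after, tolerance=0.02):
    """Classify each test shared by two runs as improved, regressed, or unchanged.

    before/after map a test name to a result where higher is better,
    such as frames per second. Relative changes within `tolerance`
    (an assumed 2% noise margin) are reported as unchanged.
    """
    verdicts = {}
    for test, old in before.items():
        if test not in after:
            continue  # only tests present in both runs can be compared
        change = (after[test] - old) / old
        if change > tolerance:
            verdicts[test] = "improved"
        elif change < -tolerance:
            verdicts[test] = "regressed"
        else:
            verdicts[test] = "unchanged"
    return verdicts
```

Any test reported as regressed alongside the intended improvement is the kind of adverse side effect worth investigating before moving on to the next optimization.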
This latest work is somewhat similar to how the Phoronix Test Suite has been used to easily track down regressions in the Linux kernel (and in other software components kept in a revision control system, via the Phoronix Test Suite bisect module), to generate an ideal kernel configuration by automatically placing a system under test in all possible combinations, and in other random attempts to find more uses for the Phoronix Test Suite. This time, though, it's a matter of latching on a system profiler to provide an entirely different perspective that complements the test results. Thanks to the abstraction and module framework, there's also no reason why Karsk cannot be used in conjunction with Phoromatic Tracker when doing things like benchmarking the Linux kernel daily (or the latest Ubuntu packages), to also report the areas where the most kernel and user-space time is being spent as the software being tested keeps changing.
If all else fails, you can at least drink some Karsk (like the codenames for the Phoronix Test Suite releases, the name comes from a Nordic term) to make the process seem easier, or take an "enlightened" stab in the dark to see what easy performance optimizations you can find... While our Karsk is not yet ready for public consumption, those interested in trying it out once we are comfortable and ready can contact us.