Are OpenGL Threaded Optimizations Responsible For NVIDIA's Faster Linux Performance?

Written by Michael Larabel in Display Drivers on 22 March 2015 at 10:30 AM EDT. Page 1 of 3.

Following the recent BioShock Infinite Linux benchmark results and the big Metro Redux graphics card comparison on Linux, a fair number of Linux gamers have hypothesized in the forums that NVIDIA's Linux gaming performance wins are due to its driver supporting OpenGL threaded optimizations. That's not always the case, as this article shows with a __GL_THREADED_OPTIMIZATIONS comparison across many Linux games and different GeForce hardware.

First off, BioShock Infinite does enable the __GL_THREADED_OPTIMIZATIONS setting by default to allow for the threaded driver optimizations, as do some other Steam games. However, Metro 2033 Redux and Metro Last Light Redux don't set __GL_THREADED_OPTIMIZATIONS in their launch scripts. This environment variable has been around since the end of 2012 and offloads the driver's CPU-side computation work to a separate worker thread. NVIDIA has long advertised that this optimization can help CPU-intensive workloads but can regress the performance of games that depend on synchronous OpenGL calls, which is why NVIDIA doesn't enable the option by default and leaves it hidden behind an environment variable.
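As a quick sketch of how the variable is used, it can be force-enabled per process from a shell prefix or a Steam game's Launch Options field; `env` below is just a stand-in for the actual game binary.

```shell
# Enable the NVIDIA driver's threaded optimizations for a single process.
# The VAR=value prefix sets the variable only in that command's environment;
# "env" stands in here for whatever binary a game's launch script executes.
__GL_THREADED_OPTIMIZATIONS=1 env | grep __GL_THREADED_OPTIMIZATIONS

# The equivalent in a Steam game's Launch Options field would be:
#   __GL_THREADED_OPTIMIZATIONS=1 %command%
```

Setting the variable as a per-command prefix rather than exporting it keeps the toggle scoped to one run, which makes back-to-back on/off comparisons easy.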

Last year I ran some NVIDIA threaded OpenGL optimization tests on modern hardware and found the results to be mixed. Given the recent Steam Linux game releases, I decided to test the latest titles on the newest NVIDIA driver to see how much of a performance impact the setting has. The workloads tested with __GL_THREADED_OPTIMIZATIONS=0 and __GL_THREADED_OPTIMIZATIONS=1 forced (by setting the environment variable and removing the per-game launch script overrides where necessary) were BioShock Infinite, Metro 2033 Redux, Metro Last Light Redux, Tesseract, Unigine Valley, and Xonotic.

NVIDIA GPU Linux Threaded Optimizations

The graphics cards tested for each of the workloads were a GeForce GTX 750, GeForce GTX 680, and GeForce GTX 980. Aside from toggling the threaded optimization value for the NVIDIA binary driver, all other settings were left at their defaults. All of the benchmarks were carried out in a fully-automated manner using the open-source Phoronix Test Suite benchmarking software.
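A minimal sketch of how one such on/off pairing could be driven with the Phoronix Test Suite, assuming the relevant test profiles are installed (the exact invocations used for this article aren't published, and `unigine-valley` is just one of the tested profiles):

```shell
# Run the same benchmark once per threaded-optimization setting so the two
# result files can be compared afterwards. The environment variable prefix
# applies only to that benchmark run.
__GL_THREADED_OPTIMIZATIONS=0 phoronix-test-suite benchmark unigine-valley
__GL_THREADED_OPTIMIZATIONS=1 phoronix-test-suite benchmark unigine-valley
```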
