Experimental Patches For Using SIMD32 Fragment Shaders With Intel's Linux Driver
Existing Intel graphics hardware already supports SIMD32 fragment shaders, and Intel's open-source Linux graphics driver has had support for this mode for months, but it hasn't been enabled. That is now in the process of changing.
Since June, the Intel Mesa driver's fragment shader code has supported the SIMD32 mode available on the past several generations of Intel graphics hardware, but it hasn't actually been turned on. It was left disabled because the heuristics for deciding when to use it over the other code paths were not in place.
Intel developer Toni Lönnberg today posted a set of seven patches providing some SIMD32 selection heuristics for the Intel Mesa driver. The heuristics are not complete, but they are enough that an Intel customer is happy with the performance in their environment until a proper solution is in place. In its current form, the SIMD32 selection is based on the number of enabled MRTs (multiple render targets), the number of grouped texture fetches, and the instruction count ratio between the SIMD16 and SIMD32 modes.
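To make the three inputs concrete, here is a minimal Python sketch of how a selection heuristic of this kind might combine them. It is not taken from the patches: the names ShaderStats and prefer_simd32, the threshold values, and even the direction in which each input pushes the decision are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ShaderStats:
    """Per-shader inputs named in the patch description (field names are hypothetical)."""
    enabled_mrts: int             # number of enabled multiple render targets
    grouped_texture_fetches: int  # number of grouped texture fetches
    simd16_instructions: int      # instruction count of the SIMD16 variant
    simd32_instructions: int      # instruction count of the SIMD32 variant


# Hypothetical thresholds, invented for this sketch; the real patches may use
# entirely different values and comparisons.
MAX_MRTS = 1
MAX_GROUPED_FETCHES = 4
MAX_INSTRUCTION_RATIO = 2.0


def prefer_simd32(stats: ShaderStats) -> bool:
    """Return True if this illustrative heuristic would pick the SIMD32 variant."""
    # Assumption: many render targets make the wider mode less attractive.
    if stats.enabled_mrts > MAX_MRTS:
        return False
    # Assumption: large groups of texture fetches favor the narrower mode.
    if stats.grouped_texture_fetches > MAX_GROUPED_FETCHES:
        return False
    # Without a valid SIMD16 count there is no ratio to compare against.
    if stats.simd16_instructions <= 0:
        return False
    # Reject SIMD32 if its program grew too much relative to SIMD16.
    ratio = stats.simd32_instructions / stats.simd16_instructions
    return ratio <= MAX_INSTRUCTION_RATIO
```

With these made-up thresholds, a shader with one render target, two grouped fetches and a SIMD32 program 1.5 times the size of its SIMD16 counterpart would be compiled as SIMD32, while the same shader with four render targets would not.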
When testing on Intel Broxton hardware, the biggest benefit was found in the GLBench5 ALU2 test case, where performance is up by 38%. But there are a number of regressions with the current SIMD32 code, leading some tests such as the GLBenchmark fill test to lose about 7% in performance.
The patches in their current form add an INTEL_DEBUG=heur32 environment variable switch for enabling the SIMD32 selection heuristics. These experimental patches were posted a short time ago to the Mesa list.
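For anyone scripting benchmark runs around this switch, the following Python sketch checks whether heur32 is present in INTEL_DEBUG. It assumes the variable holds a comma-separated list of flag names, as Mesa documents for INTEL_DEBUG; the helper names intel_debug_flags and heur32_enabled are hypothetical, and the driver's own parsing is done in C and may differ in detail.

```python
import os


def intel_debug_flags(environ=None):
    """Return the set of flag names found in INTEL_DEBUG.

    Assumes a comma-separated list; surrounding whitespace is ignored.
    """
    env = os.environ if environ is None else environ
    raw = env.get("INTEL_DEBUG", "")
    return {flag.strip() for flag in raw.split(",") if flag.strip()}


def heur32_enabled(environ=None):
    """True if the experimental heur32 flag is present in INTEL_DEBUG."""
    return "heur32" in intel_debug_flags(environ)
```

A benchmark wrapper could call heur32_enabled() before launching a test to record whether the run used the experimental heuristics, which only take effect with the patched driver.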