Intel Wires Up Dual-SIMD8 Dispatch For Mesa Drivers
Intel's open-source Linux drivers for OpenGL (Iris) and Vulkan (ANV) this week received support in Mesa 24.0 for dual-SIMD8 dispatch on Gen12 graphics (Tigerlake) and newer.
Dual-SIMD8 dispatch has been supported on Intel graphics hardware since Tigerlake and should allow for better ALU utilization. However, this new code in Mesa 24.0 isn't enabled by default for now, pending further performance testing. It also looks like Intel Xe2 graphics is where the performance benefits may become more apparent.
Intel Linux graphics driver developer Francisco Jerez explained:
"This MR implements support for multipolygon pixel shader dispatch which is supported by TGL hardware and later. On Gfx12.x hardware multipolygon PS dispatch is limited to 2 polygons per SIMD thread, and can in theory allow better ALU utilization than either plain SIMD8 or SIMD16 while rendering a large number of small polygons that can't utilize the ALUs efficiently in SIMD16 dispatch mode.
...
Note that since no major performance changes have been observed on Gfx12 this series doesn't enable dual-SIMD8 by default yet until further performance evaluation is completed, however it can be enabled manually via the INTEL_SIMD_DEBUG=fs2x8 environment variable. The main motivation for this series right now is to prepare the compiler for the additional multipolygon modes available on Gfx20+, which is likely to get a greater benefit from multipolygon dispatch than Gfx12 due to its doubled ALU vector width and larger variety of multipolygon dispatch modes."
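To make the lane-utilization argument in that quote more concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions and not how the hardware or the Mesa compiler actually schedules work: it assumes pixels are dispatched as 2x2 blocks ("subspans") of four SIMD lanes each, that a plain SIMD8 or SIMD16 thread only carries subspans from a single polygon, and that a dual-SIMD8 thread splits its 16 lanes evenly between two polygon slots. The function name `simulate`, the `MODES` table, and the sample polygon sizes are all hypothetical.

```python
from math import ceil

LANES_PER_SUBSPAN = 4  # assumption: one 2x2 pixel block occupies 4 SIMD lanes

# Hypothetical mode table: (lanes per thread, polygon slots per thread).
MODES = {
    "SIMD8": (8, 1),
    "SIMD16": (16, 1),
    "dual-SIMD8": (16, 2),
}


def simulate(subspans_per_polygon, lanes_per_thread, polygons_per_thread):
    """Return (threads, lane_utilization) for the toy dispatch model."""
    # Lanes are split evenly between the polygon slots of a thread.
    slot_subspans = lanes_per_thread // polygons_per_thread // LANES_PER_SUBSPAN
    # A slot never mixes subspans from different polygons.
    slots = sum(ceil(n / slot_subspans) for n in subspans_per_polygon)
    threads = ceil(slots / polygons_per_thread)
    useful_lanes = sum(subspans_per_polygon) * LANES_PER_SUBSPAN
    return threads, useful_lanes / (threads * lanes_per_thread)


# Eight tiny polygons, each covering a single 2x2 pixel block (made-up input).
tiny_polygons = [1] * 8
for name, (lanes, slots_per_thread) in MODES.items():
    threads, utilization = simulate(tiny_polygons, lanes, slots_per_thread)
    print(f"{name}: {threads} threads, {utilization:.0%} of lanes busy")
```

In this simplified model, dual-SIMD8 keeps twice as many lanes busy as SIMD16 for tiny polygons and needs half as many threads as SIMD8. The model only counts lane occupancy and thread count; it says nothing about instruction issue cost, register pressure, or real-world performance, which is exactly what the pending performance evaluation is meant to settle.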
See this merge request for all the details. The functionality is now in place for Mesa 24.0, due out as stable in mid-Q1. Those with recent Intel integrated or discrete graphics hardware can experiment with this dual-SIMD8 dispatch via the INTEL_SIMD_DEBUG=fs2x8 environment variable. It's also interesting to hear about the doubled ALU vector width and larger variety of multipolygon dispatch modes for Intel Xe2 graphics.
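For those who want to script the experiment, below is a minimal Python sketch that launches a program with INTEL_SIMD_DEBUG=fs2x8 set in its environment. Only the variable name and value come from the merge request; the helper names `simd_debug_env` and `run_with_dual_simd8` are hypothetical, and which test program you launch is up to you.

```python
import os
import subprocess
import sys


def simd_debug_env(value="fs2x8", base=None):
    """Return a copy of the environment with INTEL_SIMD_DEBUG set."""
    env = dict(os.environ if base is None else base)
    env["INTEL_SIMD_DEBUG"] = value  # variable name and value per the merge request
    return env


def run_with_dual_simd8(command):
    """Run `command` (a list of arguments) with dual-SIMD8 dispatch requested."""
    return subprocess.run(command, env=simd_debug_env()).returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python run_fs2x8.py <your OpenGL or Vulkan program> [args...]
    sys.exit(run_with_dual_simd8(sys.argv[1:]))
```

Running the same workload with and without the variable set, on a Mesa build that includes this merge request, is the simplest way to see whether dual-SIMD8 makes any difference on a given system.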