Intel Linux Driver Patches Yield 10~63% Faster Performance For Select Gen12/TGL GPUs
Users of various Intel Tiger Lake graphics and other "Gen12" graphics SKUs like the DG1 discrete graphics cards could soon be seeing a huge performance speed-up with the open-source Linux driver.
It turns out there is a sizable performance bottleneck right now in the Intel Gen12 graphics driver support on Linux when using hardware with fewer than 96 execution units. With patches to address this shortcoming, OpenGL/Vulkan performance can be anywhere from 10% to 63% faster compared to the current state.
The current defect stems from relying on pre-programmed pixel pipe hashing tables. Those static tables were created on the assumption that all pixel pipes have the same processing power, which doesn't hold true when running on hardware with fused configurations where some pixel pipes can be missing subslices. The current hashing tables thus lead to a "serious bottleneck" on the hardware with lower EU counts.
The pending work changes the behavior to calculate a pixel-pipe hashing table that is balanced according to the computational power of each pixel pipe. Intel Tiger Lake and DG1 graphics with fewer than 96 execution units are likely to see the most dramatic improvement. The reported results are very tantalizing: "an FPS improvement that has been observed to range between 10% and 63% for most non-trivial graphics workloads I've tried on an 80 EU TGL platform."
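To make the idea of a balanced hashing table more concrete, here is a minimal Python sketch. It is not the code from the Mesa patches: the function name, the two-pipe layout, the 3-versus-2 subslice weights, and the table size are all hypothetical and chosen only for illustration. It hands out table entries so that each pipe's share of the table tracks its share of the total compute.

```python
def balanced_hash_table(weights, size):
    """Return a list of `size` pipe indices, distributed in proportion to `weights`.

    `weights` is a hypothetical per-pipe measure of compute (for example,
    the number of enabled subslices behind each pixel pipe).
    """
    total = sum(weights)
    if total <= 0 or size <= 0:
        raise ValueError("need positive total weight and table size")

    assigned = [0] * len(weights)  # entries given to each pipe so far
    table = []
    for n in range(1, size + 1):
        # After n entries a pipe ideally owns weights[p] * n / total of them.
        # Choose the pipe furthest below that ideal; the comparison is scaled
        # by `total` so it stays in exact integer arithmetic.
        pipe = max(
            range(len(weights)),
            key=lambda p: weights[p] * n - assigned[p] * total,
        )
        assigned[pipe] += 1
        table.append(pipe)
    return table


# Equal pipes give the plain alternating pattern a static table would assume.
print(balanced_hash_table([1, 1], 4))

# Assumed fused part: pipe 0 keeps 3 subslices, pipe 1 only 2.
table = balanced_hash_table([3, 2], 10)
print(table, table.count(0), table.count(1))
```

With equal weights the sketch degenerates to an even split, which is the assumption baked into the static tables. With the uneven 3:2 weights, the weaker pipe receives 4 of every 10 entries instead of 5, so it is no longer handed more work than it can keep up with.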
The improvement hasn't yet been merged into Mesa, as it's first awaiting more widespread testing on different Intel graphics hardware in other EU configurations. Those with Gen12 graphics wanting to test (or Gen11 Ice Lake to ensure no regressions) can find these exciting patches via this MR. Unfortunately, my lone Tiger Lake / Gen12 hardware (purchased retail...) is the i7-1165G7 with 96 EUs, but I will run some benchmarks anyhow to confirm no surprises.