Intel Working To Improve The Reset Experience During GPU Hangs
Carlos Santa of Intel is presenting the company's latest work on a low-latency, engine-based GPU reset mechanism. With the current behavior, unexpected GPU behavior, which can occur after hours of usage, leads to the UI freezing, followed by a black screen and a system reboot.
Under the current design, a full GPU reset happens, whereas the approach being pursued is to reset just the particular engine that's hung. The full GPU reset generally interrupts the user experience and is what can result in the black screen and/or system reboot.
This per-engine resetting relies upon timeout detection and recovery (TDR) to reset engines independently: the user-mode (UMD) media driver utilizes a watchdog timer when sending batch buffers, and the GPU driver in turn resets only the affected engines/blocks after the timeout occurs.
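To make the idea concrete, here is a minimal toy sketch in Python of per-engine timeout detection and recovery. It is not Intel's driver code: the `Engine` class, the `recover_hung_engines` function, the engine names, and the timeout values are all hypothetical and chosen only for illustration. It simply shows how a watchdog deadline per batch buffer lets recovery touch only the engine that stalled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Engine:
    """Hypothetical stand-in for one GPU engine (e.g. render or video)."""
    name: str
    busy_since: Optional[float] = None  # when the current batch buffer was submitted
    resets: int = 0                     # how many times this engine has been reset

    def submit(self, now: float) -> None:
        # Submitting a batch buffer arms the watchdog for this engine.
        self.busy_since = now

    def complete(self) -> None:
        # Finishing the batch buffer disarms the watchdog.
        self.busy_since = None

    def reset(self) -> None:
        # Drop the stuck work and count the reset; other engines are untouched.
        self.busy_since = None
        self.resets += 1


def recover_hung_engines(engines: List[Engine], now: float, timeout: float) -> List[str]:
    """Reset only the engines whose watchdog expired; return their names."""
    hung = [e for e in engines
            if e.busy_since is not None and now - e.busy_since > timeout]
    for engine in hung:
        engine.reset()
    return [e.name for e in hung]


if __name__ == "__main__":
    engines = [Engine("render"), Engine("video")]
    for e in engines:
        e.submit(now=0.0)
    engines[0].complete()  # render finishes normally, video stalls
    print(recover_hung_engines(engines, now=5.0, timeout=1.0))
```

In this toy model only the stalled engine is reset while the engine that completed its work is left alone, which is the contrast with a full GPU reset described above.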
This "TDR" approach is what we wrote about months ago, but it is still working its way to the mainline kernel. The watchdog components, GuC integration, and other bits are still pending, but hopefully we'll see it settled in the months ahead.
More details in this slide deck (PDF).