Intel Working To Improve The Reset Experience During GPU Hangs

Written by Michael Larabel in Intel on 28 September 2018 at 07:31 AM EDT.
Driven to improve the Chrome OS user experience, Intel open-source developers have been working to improve their GPU reset behavior when problems are encountered under 3D/multimedia workloads.

Carlos Santa of Intel is presenting their latest work on a low-latency, engine-based GPU reset mechanism. The current behavior when the GPU misbehaves, which can happen after hours of usage, is that the UI freezes, followed by a black screen and a system reboot.

Under the current design, a full GPU reset happens, whereas the approach being pursued is to reset just the particular engine that's hung. The full GPU reset generally interrupts the user experience and is what can result in the black screen and/or system reboot.

This per-engine resetting relies upon timeout detection and recovery (TDR) to reset engines independently: the UMD media driver utilizes a watchdog timer when sending batch buffers, and the GPU driver in turn resets only the affected engines/blocks after the timeout occurs.
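
To make the idea concrete, here is a minimal toy model of watchdog-driven, per-engine recovery. It is only a sketch of the concept and not the actual i915 or media driver code: the `Engine` class, the `watchdog_ms` budget, the engine names, and the `recover` and `simulate` helpers are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Engine:
    """Toy stand-in for one GPU engine (hypothetical, not a driver structure)."""
    name: str
    watchdog_ms: int   # assumed per-batch time budget set by the submitter
    busy_ms: int = 0   # how long the current batch has been running
    resets: int = 0    # how many times this engine has been reset

    def submit(self) -> None:
        # A new batch buffer starts with a fresh watchdog countdown.
        self.busy_ms = 0

    def tick(self, elapsed_ms: int, hung: bool) -> None:
        # A healthy engine completes its work; a hung one keeps accumulating time.
        self.busy_ms = self.busy_ms + elapsed_ms if hung else 0

    def timed_out(self) -> bool:
        return self.busy_ms > self.watchdog_ms

    def reset(self) -> None:
        self.busy_ms = 0
        self.resets += 1


def recover(engines: list[Engine]) -> list[str]:
    """Reset only the engines whose watchdog expired and return their names."""
    reset_names = []
    for engine in engines:
        if engine.timed_out():
            engine.reset()
            reset_names.append(engine.name)
    return reset_names


def simulate() -> tuple[list[str], dict[str, int]]:
    """Hang the 'video' engine for 80 ms against a 60 ms budget, then recover."""
    engines = [Engine("render", 100), Engine("video", 60), Engine("blitter", 100)]
    for engine in engines:
        engine.submit()
    for _ in range(4):
        for engine in engines:
            engine.tick(20, hung=(engine.name == "video"))
    reset_names = recover(engines)
    return reset_names, {engine.name: engine.resets for engine in engines}
```

In this model only the engine that exceeded its watchdog budget is reset, while the others keep their state. A full GPU reset would correspond to calling `reset()` on every engine regardless of which one hung.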


This "TDR" approach is something we wrote about months ago, but it is still working its way into the mainline kernel. The watchdog components, GuC integration, and other bits are still pending, but hopefully we'll see it settled in the months ahead.

More details in this slide deck (PDF).