The Nasty Linux 6.3 Nouveau Driver Bug Appears To Have Been Figured Out
As a follow-up to the potentially nasty bug in the open-source NVIDIA "Nouveau" driver in Linux 6.3, the cause is believed to have been identified and a pending patch appears to address it.
A warning was sent out a few days ago to avoid using Nouveau on the current Linux 6.3 stable series due to a use-after-free issue within this kernel graphics driver. The use-after-free can lead to kernel memory corruption and in turn could potentially cause file-system corruption or other system issues, not to mention being a possible security issue as well.
Red Hat's David Airlie now believes he has tracked down the issue from the month-old bug report. Airlie posted a patch yesterday as a proposed fix. So far both Nouveau developer Karol Herbst at Red Hat and another user previously plagued by this problem have indicated that the use-after-free no longer occurs with the patch applied.
Airlie explained when posting the patch to dri-devel:
"This seems to have existed for ever but is now more apparent after 9bff18d13473a9fdf81d5158248472a9d8ecf2bd (drm/ttm: use per BO cleanup workers)
My analysis:
two threads are running, one in the irq signalling the fence, in dma_fence_signal_timestamp_locked, it has done the DMA_FENCE_FLAG_SIGNALLED_BIT setting, but hasn't yet reached the callbacks.
second thread in nouveau_cli_work_ready, where it sees the fence is signalled, so then puts the fence, cleanups the object and frees the work item, which contains the callback.
thread one goes again and tries to call the callback and causes the use-after-free.
Proposed fix:
lock the fence signalled check in nouveau_cli_work_ready, so either the callbacks are done or the memory is freed."
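To make the race and the proposed fix a bit more concrete, here is a minimal user-space sketch in C using pthreads. It is not the kernel patch: every name in it (toy_fence, toy_work, toy_fence_signal, toy_work_ready, toy_demo) is a hypothetical stand-in, and a pthread mutex plays the role of the fence lock. The idea it illustrates is the one from the quoted analysis: if the signaller sets the "signalled" flag and runs the callback while holding the lock, and the cleanup side only checks the flag while holding that same lock, then the work item can only be freed once the callback has finished.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; not the kernel dma_fence API. */
struct toy_fence {
    pthread_mutex_t lock;     /* plays the role of the fence lock */
    bool signalled;           /* plays the role of the "signalled" bit */
    void (*cb)(void *);       /* callback to run when the fence signals */
    void *cb_data;            /* points into the work item */
};

struct toy_work {
    struct toy_fence *fence;
    int cb_ran;               /* written by the callback */
};

static void toy_work_cb(void *data)
{
    struct toy_work *work = data;
    work->cb_ran = 1;         /* touches the work item: must not be freed yet */
}

/* "Thread one": mark the fence signalled, then run the callback, all under the lock. */
static void toy_fence_signal(struct toy_fence *fence)
{
    pthread_mutex_lock(&fence->lock);
    fence->signalled = true;
    if (fence->cb) {
        fence->cb(fence->cb_data);
        fence->cb = NULL;
    }
    pthread_mutex_unlock(&fence->lock);
}

/*
 * "Thread two": the signalled check takes the same lock, so it cannot observe
 * signalled == true while the signaller sits between setting the flag and
 * finishing the callback.
 */
static bool toy_work_ready(struct toy_work *work)
{
    bool ready;

    pthread_mutex_lock(&work->fence->lock);
    ready = work->fence->signalled;
    pthread_mutex_unlock(&work->fence->lock);
    return ready;
}

static void *toy_signaller(void *arg)
{
    toy_fence_signal(arg);
    return NULL;
}

/* Runs both sides once; returns whether the callback had run before the free. */
int toy_demo(void)
{
    struct toy_fence fence = { .signalled = false };
    struct toy_work *work = malloc(sizeof(*work));
    pthread_t thread;
    int cb_ran;

    assert(work != NULL);
    pthread_mutex_init(&fence.lock, NULL);
    work->fence = &fence;
    work->cb_ran = 0;
    fence.cb = toy_work_cb;
    fence.cb_data = work;

    if (pthread_create(&thread, NULL, toy_signaller, &fence) != 0) {
        pthread_mutex_destroy(&fence.lock);
        free(work);
        return -1;
    }

    while (!toy_work_ready(work))
        sched_yield();        /* wait until the fence reports signalled */

    cb_ran = work->cb_ran;    /* callback has completed by this point */
    free(work);               /* safe: the signaller no longer touches work */

    pthread_join(thread, NULL);
    pthread_mutex_destroy(&fence.lock);
    return cb_ran;
}
```

In this sketch, reading the flag without taking the lock would reopen the window described above: the cleanup side could see the flag set, free the work item, and leave the signaller to call into freed memory. When building with GCC or Clang, the sketch would typically need the -pthread flag.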
With a few lines of code, the problem is hopefully resolved.
At the moment the patch is still sitting on the mailing list, but it will presumably be included in the next round of DRM-Fixes sent in for the mainline kernel.