While debugging my problems with Serious Sam 3: BFE on Debian stretch with Mesa 10.6.3, I came across an alarmingly large number of the following log lines from the radeon driver in the kernel:
[22097.775805] radeon 0000:05:00.0: GPU fault detected: 147 0x03e84401
[22097.775807] radeon 0000:05:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0FF0081F
[22097.775808] radeon 0000:05:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
[22097.775809] VM fault (0x01, vmid 4) at page 267388959, read from TC (68)
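For anyone trying to read these lines: the last line appears to be just a decoded form of the two registers above it. The following is a minimal sketch of that decoding, not the driver's code. The bit layout in `decode_fault_status` (a hypothetical helper name) is an assumption that I have not verified against the kernel source; it is only shown because it reproduces the values in the log line above.

```python
# Assumed bit layout of VM_CONTEXT1_PROTECTION_FAULT_STATUS (not verified
# against the radeon source; chosen because it matches the log line above).
def decode_fault_status(status):
    return {
        "protections": status & 0xF,          # assumed: low bits
        "client_id": (status >> 12) & 0xFF,   # assumed: bits 12-19
        "is_write": bool((status >> 24) & 1), # assumed: bit 24, 0 = read
        "vmid": (status >> 25) & 0xF,         # assumed: bits 25-28
    }

# Register values copied from the log lines above.
fault_addr = 0x0FF0081F
fault_status = 0x08044001

fields = decode_fault_status(fault_status)

# The "page" in the VM fault line is the fault address printed in decimal.
print("page:", fault_addr)
print("fields:", fields)
```

Under those assumptions the output corresponds to "VM fault (0x01, vmid 4) at page 267388959, read from TC (68)", so the fourth line carries no information beyond the second and third.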
I counted how many of these messages are in my kernel log, and there are indeed a LOT:
egrep "GPU fault|VM_CONTEXT1|VM fault" /var/log/kern.log | wc -l
25808741
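A raw count does not say whether the faults hit one page over and over or many different ones. Below is a rough sketch of how the "VM fault" lines could be grouped instead of just counted. The regular expression is an assumption based only on the single sample line above, and `summarize_faults` is a hypothetical helper name; the sample input is the four log lines quoted earlier.

```python
import re
from collections import Counter

# Pattern assumed from the one "VM fault" line quoted in this post.
VM_FAULT_RE = re.compile(
    r"VM fault \((0x[0-9a-fA-F]+), vmid (\d+)\) at page (\d+), "
    r"(read|write) from (\w+) \((\d+)\)"
)

def summarize_faults(lines):
    """Count faults per (vmid, page, access type, memory client)."""
    counts = Counter()
    for line in lines:
        m = VM_FAULT_RE.search(line)
        if m:
            _prot, vmid, page, access, client, _client_id = m.groups()
            counts[(int(vmid), int(page), access, client)] += 1
    return counts

# The four lines from the top of this post, used as sample input.
sample = [
    "[22097.775805] radeon 0000:05:00.0: GPU fault detected: 147 0x03e84401",
    "[22097.775807] radeon 0000:05:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0FF0081F",
    "[22097.775808] radeon 0000:05:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001",
    "[22097.775809] VM fault (0x01, vmid 4) at page 267388959, read from TC (68)",
]

counts = summarize_faults(sample)
for key, n in counts.most_common(10):
    print(n, key)
```

To run it over the real log, pass an open file instead of `sample`, for example `summarize_faults(open("/var/log/kern.log", errors="replace"))`. A handful of pages dominating the output would point in a different direction than faults spread across the whole address range.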
Before I booted my system a few days ago with a freshly compiled Linux 4.1.4 kernel and the Mesa 10.6.3 stack, I had never seen these messages under any circumstances, not even when gaming or otherwise stressing the graphics card. I've already checked, and the 4.1.5 kernel doesn't contain any patches to the radeon driver, so I'm hoping that this is indeed a kernel bug and not a sign of a defective memory chip or a flaky power supply. I use lm-sensors to monitor CPU and GPU temperatures, and neither has spiked alarmingly so far.
lspci:
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde XT [Radeon HD 7770/8760 / R7 250X]
glxinfo:
OpenGL renderer string: Gallium 0.4 on AMD CAPE VERDE
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.6.3
OpenGL core profile shading language version string: 3.30
Does anyone have any advice on how I should proceed here? I'm going to check whether these log lines have been reported elsewhere, but quick Google searches haven't turned up anything yet.