Originally posted by tildearrow
I do sometimes notice line voltage fluctuation when this happens, but no matter how good the voltmeter on my UPS is, it's not going to react quickly enough to show an instantaneous drop. My theory is that it happens when the voltage is high and suddenly drops low; in other words, it's the delta that something is reacting to when the card is in low clock mode. I happened to see it once: the line voltage was about 127V (yes, that's real, I've compared the voltmeter on my UPS's OSD against a multitester), dropped down to about 113V, and came back up to about 124V.
It has never happened when the card is in high clock mode, for example while playing games. It's more likely to happen when reading a web page, or especially while doing nothing. I have also never come back and found my system dead after display power management has shut the screen off, though I have come back after about 2 minutes (refilling my coffee) to find it dead, with the screen black and back-lit. Nothing was going on, and all programs except the desktop environment were closed.
I am using Plasma 5 on both my Kubuntu (for games) and Arch (for work and life) installs now, but this has also happened with XFCE on a very simple Crux install.
I roll my own kernels, as I have been doing for decades.
I have ASPM disabled (I have always just enabled it in the kernel and set "Performance" mode), but I have been booting with pcie_aspm=off for some time now. It's something you have to enable in the kernel or you can't disable it :-)
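For anyone who wants to check the same two settings on their own box, here is a rough sketch. It assumes the kernel was built with ASPM support, so that /sys/module/pcie_aspm/parameters/policy exists and lists the policies with the active one in square brackets. The function names are made up for illustration and are not part of any tool.

```python
from pathlib import Path

# Assumed locations; the sysfs file only exists when ASPM support is built in.
POLICY_PATH = "/sys/module/pcie_aspm/parameters/policy"
CMDLINE_PATH = "/proc/cmdline"


def active_aspm_policy(policy_text):
    """Return the bracketed (active) entry, e.g. 'performance', or None."""
    for token in policy_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def aspm_off_on_cmdline(cmdline_text):
    """True if pcie_aspm=off appears as its own kernel command-line argument."""
    return "pcie_aspm=off" in cmdline_text.split()


def read_text_or_none(path):
    """Read a small text file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    policy = read_text_or_none(POLICY_PATH)
    cmdline = read_text_or_none(CMDLINE_PATH)
    if policy is None:
        print("ASPM policy file not found (ASPM may not be built into this kernel)")
    else:
        print("Active ASPM policy:", active_aspm_policy(policy))
    if cmdline is not None:
        print("pcie_aspm=off on command line:", aspm_off_on_cmdline(cmdline))
```

This only reports what the kernel was told, not whether a given device actually honours it.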
It happened to me in Windows once; I had just clicked a menu in Firefox. HOWEVER, I don't use Windows very often anymore, and while I'm there I'd be very unlikely to be idle (I'd be playing a game).
The problem could occur 3 times in one day, or go a week without happening. This is why I think it's external factors, for example line conditions.
My guess would be AMD drivers and firmware blobs reacting adversely.
Driver recovery has NEVER really worked properly for me since I bought this card. I used to have assloads of problems in Windows with early AMD drivers. It got to the point where I just set Windows TDR ("Timeout Detection and Recovery") to a blue screen stop error (BugCheck on timeout) instead of recovery. It's marginally better than a hard power off, as it at least syncs buffers. I don't seem to have that particular problem in Windows anymore, but I am using Windows 7 and sticking with AMD drivers that have no problems, not upgrading them for the sake of it. I don't really get the kind of game crashes that would invoke TDR.
My graphics card is an MSI-brand R9 380.