Bricked RX 560

  • #31
    Originally posted by bridgman
    My best guess right now is that you were more lucky than normal with the first card (sounds like it went into thermal shutdown a dozen or so times before finally sustaining damage) and less lucky than normal with the second card.
    Or, as another, less convenient guess: maybe some GPUs had better margins while others got unlucky and couldn't react before getting fried, i.e. they were running on the edge with margins that were too small? Or maybe the feature got broken. I thought emergency overheat was handled by firmware on some helper CPU, no? Just like DVFS. So it could be quite fast, no? As for draining caps, er, I always thought power and clock gating happens inside the IC itself, in which case the large caps of the VRM are out of the equation, no? Isn't it possible to power/clock gate everything that can be gated in an emergency? I also remember overheat margins tend to be configurable (could the vendor change them?). Maybe someone left the margins too small? (I've seen quite a few Nvidia cards fried by overheating, but that's the hallmark of certain Nvidia families: high-end cards of older generations ran on the edge of their thermal design, so they were really prone to failure.)
    Last edited by SystemCrasher; 27 July 2017, 12:16 AM.



    • #32
      Originally posted by SystemCrasher
      Or, as another, less convenient guess: maybe some GPUs had better margins while others got unlucky and couldn't react before getting fried, i.e. they were running on the edge with margins that were too small? Or maybe the feature got broken. I thought emergency overheat was handled by firmware on some helper CPU, no? Just like DVFS. So it could be quite fast, no? As for draining caps, er, I always thought power and clock gating happens inside the IC itself, in which case the large caps of the VRM are out of the equation, no? Isn't it possible to power/clock gate everything that can be gated in an emergency? I also remember overheat margins tend to be configurable (could the vendor change them?). Maybe someone left the margins too small? (I've seen quite a few Nvidia cards fried by overheating, but that's the hallmark of certain Nvidia families: high-end cards of older generations ran on the edge of their thermal design, so they were really prone to failure.)
      Maybe. They may not have wanted to put much effort into optimizing emergency shutdown, a feature that is not that important. And I am pretty sure this is not something tested in QA, as it is not an advertised feature, just something the GPU vendors implemented to give a better user experience. I also think it is difficult to set good temperature limits for all chips and GPU designs, as they are all a little bit different (as far as I understand it). And you do not want to shut down the card much too early, as this might frustrate users as well.
      To reliably determine whether AMD or Nvidia GPUs, and which chips and designs, are easier to kill without (proper) cooling, testing would be required. But as you'd have to fry a lot of GPUs in the process, I do not think anyone will do this. Although it would be a very different benchmark and kind of interesting, too.



      • #33
        Well, these cards certainly did not have working overheat protection. Both cards had plenty of time to heat up (the second one about an hour, as mentioned before). Or to put it another way: during my testing, with a sample size of 2 and a very benign temperature profile, the failure rate was 100 %.
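
For a rough sense of how much a sample of 2 can say about the wider population, here is a minimal sketch of the exact one-sided lower confidence bound on the failure probability when every tested card failed. The function name `lower_bound_all_failed` and the 95 % confidence level are illustrative assumptions, not something from the testing described above, and the bound assumes the cards fail independently with the same probability.

```python
def lower_bound_all_failed(n: int, alpha: float = 0.05) -> float:
    """One-sided exact lower bound on the failure probability p
    when all n tested cards failed.

    If the true failure probability is p, the chance that all n cards
    fail is p**n. The bound is the smallest p for which that outcome
    still has probability >= alpha, i.e. p = alpha ** (1 / n).
    (Hypothetical helper, assumes independent, identical cards.)
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return alpha ** (1.0 / n)


if __name__ == "__main__":
    # Two cards tested, two cards dead.
    bound = lower_bound_all_failed(2)
    print(f"95% lower bound on failure rate with n=2: {bound:.3f}")
```

Under those assumptions, two failures out of two only show with 95 % confidence that the failure rate under this setup is above roughly 22 %, so the observed 100 % is consistent with a wide range of true rates.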
