Ryzen-Test & Stress-Run Make It Easy To Cause Segmentation Faults On Zen CPUs


  • See https://www.phoronix.com/forums/foru...244#post969244



    • Does anybody know whether the bug is present only on the fully configured models (1700, 1700X and 1800X), or also on the six-core models and on models without SMT, like the four-core 1300X and 1200?



      • FWIW, I have just gotten a replacement 1700 from AMD (the original had MCE issues and segfaults). The new one "seems" fine and was made in week 30 of this year (UA 1730).

        From information on the AMD forums, anything newer than 1725 looks to be "fixed".
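To compare a production code against the 1725 cutoff mentioned above, here is a minimal sketch. It assumes the four digits are a two-digit year followed by a two-digit week (so "UA 1730" is week 30 of 2017, as the post describes). The function names are made up for illustration, and the 1725 threshold is only what forum users reported, not an official AMD statement.

```python
import re
from typing import Tuple


def parse_date_code(code: str) -> Tuple[int, int]:
    """Return (year, week) from a YYWW code such as 'UA 1730' (assumed format)."""
    match = re.search(r"\b(\d{2})(\d{2})\b", code)
    if match is None:
        raise ValueError(f"no YYWW date code found in {code!r}")
    year = 2000 + int(match.group(1))
    week = int(match.group(2))
    if not 1 <= week <= 53:
        raise ValueError(f"week {week} out of range in {code!r}")
    return year, week


def is_newer_than(code: str, threshold: str = "1725") -> bool:
    """True if the code is strictly later than the threshold week."""
    return parse_date_code(code) > parse_date_code(threshold)


if __name__ == "__main__":
    print(parse_date_code("UA 1730"))
    print(is_newer_than("UA 1730"))
```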



        • I have had my 1700X since launch, well, a few days later: I got it on the 7th of March. I ran that Ryzen kill script for an hour with no issues. I have never seen the freeze bug either.
          The motherboard is an ASUS Prime X370-Pro. My CPU is overclocked to 3.9 GHz with a voltage offset of +0.1750 V (an offset is used so the CPU still drops to idle clocks when not needed).



          • I have a Ryzen 1700 at stock speed, never overclocked, bought on the 21st of July. I won't remove the cooler to see the production date.
            64 GB RAM at 2933 MHz.
            I use Debian, latest version (9.1), KDE Plasma desktop.

            I've run the kill-ryzen.sh script, and I can reproduce the segfault error with SMT on. But if I switch SMT off, it no longer appears.
            As I don't use SMT, and this is my main computer, I'll keep the processor for now. I'll probably send it for replacement eventually, if no permanent solution is found.

            The results of my tests, run on 25th of August, are:
            -run 1h 37m, no error, SMT off
            -run 22 minutes, SMT on, 1 thread (loop-4) quickly gave an error "TIME TO FAIL: 263 s" (no additional error message shown)
            -run 6 minutes, SMT on, 1 thread (loop-0) quickly gave an error "TIME TO FAIL: 104 s" (also segfault error message appeared)
            -run 16 minutes, no error, SMT off

            For the next tests I kept SMT off, but I increased the number of threads by editing the script:
            -run 40 minutes, no error, SMT off, 32 threads
            -run 1h 32m, no error, SMT off, 64 threads
            -run 39 minutes, no error, SMT off, 16 threads
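For anyone collecting results like these from several runs, a small sketch that pulls the "TIME TO FAIL" values out of saved script output. It assumes each failure line is tagged with its loop name in the form "[loop-N]"; that tag format and the sample text are assumptions for illustration (the sample just reuses the numbers reported above), so adjust the pattern to whatever your copy of the script prints.

```python
import re
from typing import Dict

# Assumed line shape: "[loop-4] TIME TO FAIL: 263 s"
FAIL_RE = re.compile(r"\[(loop-\d+)\].*?TIME TO FAIL:\s*(\d+)\s*s")


def first_failures(output: str) -> Dict[str, int]:
    """Map each loop name to the seconds before its first reported failure."""
    result: Dict[str, int] = {}
    for line in output.splitlines():
        match = FAIL_RE.search(line)
        if match:
            # Keep only the first failure seen for each loop.
            result.setdefault(match.group(1), int(match.group(2)))
    return result


if __name__ == "__main__":
    # Hypothetical saved output, not a real log.
    sample = "[loop-4] TIME TO FAIL: 263 s\n[loop-0] TIME TO FAIL: 104 s\n"
    for loop, seconds in sorted(first_failures(sample).items()):
        print(f"{loop}: failed after {seconds} s")
```

An empty result after a long run only means no loop reported a failure in that window; as the runs above show, failures can take anywhere from minutes to more than half an hour to appear.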



            • I ran more tests and was able to reproduce the error with SMT off as well, after a 35-minute run.
              [KERN] Aug 26 09:44:06 host kernel: as[31164]: segfault at 7c00000077 ip 0000007c00000077 sp 00007ffca8334bf1 error 14 in x86_64-linux-gnu-as[55de2995f000+5b000]
              The loop didn't appear to die, though (no "TIME TO FAIL: " message).
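To make a log line like the one above easier to read, here is a rough sketch that splits it into fields and decodes the error value. It assumes the usual Linux layout ("segfault at ADDR ip IP sp SP error CODE in OBJECT[BASE+SIZE]"), that the error value is printed in hexadecimal, and that the bits follow the x86 page-fault error code. The helper names are made up for illustration.

```python
import re
from typing import Dict, List

SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_segfault(line: str) -> Dict[str, object]:
    """Split a kernel segfault line into named fields (hex fields become ints)."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("not a segfault line")
    fields: Dict[str, object] = {"comm": match["comm"], "pid": int(match["pid"])}
    for key in ("addr", "ip", "sp", "err", "base", "size"):
        if match[key] is not None:
            fields[key] = int(match[key], 16)
    if match["obj"] is not None:
        fields["obj"] = match["obj"]
    return fields


def decode_error(err: int) -> List[str]:
    """Describe the low bits of an x86 page-fault error code."""
    return [
        "protection violation" if err & 0x1 else "page not present",
        "write" if err & 0x2 else "read",
        "user mode" if err & 0x4 else "kernel mode",
        "instruction fetch" if err & 0x10 else "data access",
    ]


if __name__ == "__main__":
    line = ("host kernel: as[31164]: segfault at 7c00000077 ip 0000007c00000077 "
            "sp 00007ffca8334bf1 error 14 in x86_64-linux-gnu-as[55de2995f000+5b000]")
    info = parse_segfault(line)
    print(info["comm"], info["pid"], decode_error(info["err"]))
    in_mapping = info["base"] <= info["ip"] < info["base"] + info["size"]
    print("ip equals fault address:", info["ip"] == info["addr"])
    print("ip inside the listed mapping:", in_mapping)
```

Under those assumptions, the fault address equals the instruction pointer and lies outside the listed mapping of the assembler binary, which would mean the process tried to execute code at a bogus address rather than read or write bad data.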
