Ryzen-Test & Stress-Run Make It Easy To Cause Segmentation Faults On Zen CPUs

  • Ryzen-Test & Stress-Run Make It Easy To Cause Segmentation Faults On Zen CPUs

    Phoronix: Ryzen-Test & Stress-Run Make It Easy To Cause Segmentation Faults On Zen CPUs

    Having run a number of new Ryzen Linux tests lately, a number of readers requested I take a fresh look at the reported Ryzen segmentation fault issues / bugs affecting many Linux users. I did and am still able to reproduce the problem...

    http://www.phoronix.com/scan.php?pag...est-Stress-Run

  • Constantin
    replied
    I ran more tests and was able to reproduce the error with SMT off as well, after a 35-minute run.
    [KERN] Aug 26 09:44:06 host kernel: as[31164]: segfault at 7c00000077 ip 0000007c00000077 sp 00007ffca8334bf1 error 14 in x86_64-linux-gnu-as[55de2995f000+5b000]
    The loop didn't appear to die though (no "TIME TO FAIL: " message).
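
For anyone reading the kernel line above: on x86 Linux the numbers in a "segfault at" message are printed in hexadecimal, so "error 14" means 0x14. Below is a minimal Python sketch for decoding such a line. The names `parse_segfault` and `ERROR_BITS` are made up for illustration, and the bit meanings are the standard x86 page-fault error code bits 0 to 4.

```python
import re

# "comm[pid]: segfault at ADDR ip IP sp SP error CODE" (all numbers are hex).
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
)

# (bit mask, meaning when set, meaning when clear or None if nothing to report)
ERROR_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def parse_segfault(line):
    """Return a dict of decoded fields, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    info = {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),
        "ip": int(m.group("ip"), 16),
        "sp": int(m.group("sp"), 16),
        "error": int(m.group("error"), 16),
    }
    flags = []
    for mask, when_set, when_clear in ERROR_BITS:
        if info["error"] & mask:
            flags.append(when_set)
        elif when_clear is not None:
            flags.append(when_clear)
    info["flags"] = flags
    return info


if __name__ == "__main__":
    line = ("[KERN] Aug 26 09:44:06 host kernel: as[31164]: segfault at "
            "7c00000077 ip 0000007c00000077 sp 00007ffca8334bf1 error 14 in "
            "x86_64-linux-gnu-as[55de2995f000+5b000]")
    print(parse_segfault(line))
```

Read this way, the line reports a user-mode instruction fetch from a page that is not mapped, and the fault address is the same as the instruction pointer, so the assembler process jumped to a bogus address rather than reading bad data.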

  • Constantin
    replied
    I have a Ryzen 1700 at stock speed, never overclocked, bought on the 21st of July. I won't remove the cooler to see the production date.
    64 GB RAM, 2933 MHz.
    I use Debian, latest version (9.1), with the KDE Plasma desktop.

    I've run the kill-ryzen.sh script, and I can reproduce the segfault error with SMT on. But if I switch SMT off, it no longer appears.
    As I don't use SMT, and this is my main computer, I'll keep the processor for now. I'll probably send it for replacement eventually, if no permanent solution is found.

    The results of my tests, run on 25th of August, are:
    -run 1h 37m, no error, SMT off
    -run 22 minutes, SMT on, 1 thread (loop-4) quickly gave an error "TIME TO FAIL: 263 s" (no additional error message shown)
    -run 6 minutes, SMT on, 1 thread (loop-0) quickly gave an error "TIME TO FAIL: 104 s" (also segfault error message appeared)
    -run 16 minutes, no error, SMT off

    For the next tests I kept SMT off, but I increased the number of threads by editing the script:
    -run 40 minutes, no error, SMT off, 32 threads
    -run 1h 32m, no error, SMT off, 64 threads
    -run 39 minutes, no error, SMT off, 16 threads
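
For readers who want to time runs like the ones listed above, here is a rough Python sketch of a "rerun until it fails" loop. It is not the kill-ryzen.sh script and makes no claim about how that script works; the function name `time_to_fail`, the placeholder command and the time limit are all assumptions for illustration.

```python
import subprocess
import sys
import time


def time_to_fail(cmd, max_seconds):
    """Rerun cmd until it exits non-zero or max_seconds have elapsed.

    Returns the elapsed seconds at the first failure, or None if the
    command never failed within the time limit.
    """
    start = time.monotonic()
    while time.monotonic() - start < max_seconds:
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            return time.monotonic() - start
    return None


if __name__ == "__main__":
    # Placeholder workload: a trivial Python process that always succeeds.
    # Replace it with a real build command to get a meaningful result.
    elapsed = time_to_fail([sys.executable, "-c", "pass"], max_seconds=2)
    if elapsed is None:
        print("no failure within the time limit")
    else:
        print(f"TIME TO FAIL: {elapsed:.0f} s")
```

Running several of these loops in parallel, one per thread, would be the natural next step, but that part is left out to keep the sketch short.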

  • xpander
    replied
    I have had my 1700X since launch, well, a few days later: I got it on the 7th of March. I ran that Ryzen kill script for an hour with no issues. I have never seen the freeze bug either.
    The motherboard is an ASUS Prime X370-Pro. My CPU is overclocked to 3.9 GHz with a voltage offset of +0.1750 V (an offset is used so the CPU still drops to idle clocks when not needed).

  • Apache14
    replied
    FWIW, I have just gotten a replacement 1700 from AMD (the old one was replaced due to MCE issues and segfaults). The new one "seems" fine and was made in week 30 of this year (UA 1730).

    From information on the AMD forums, anything newer than 1725 looks to be "fixed".
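
To compare date codes like the ones mentioned above, a small Python sketch follows. It assumes the code ends in four digits read as two-digit year plus week number (so "UA 1730" is week 30 of 2017, as described in the post), and it takes the 1725 cutoff from the forum reports, not from any official AMD statement. The function names are made up for illustration.

```python
import re


def parse_date_code(code):
    """Return (year, week) from a code ending in YYWW, e.g. 'UA 1730'."""
    m = re.search(r"(\d{2})(\d{2})\s*$", code.strip())
    if m is None:
        raise ValueError(f"no YYWW date code found in {code!r}")
    year = 2000 + int(m.group(1))
    week = int(m.group(2))
    if not 1 <= week <= 53:
        raise ValueError(f"week {week} out of range in {code!r}")
    return year, week


def is_newer_than(code, threshold="1725"):
    """True if code is strictly newer than the threshold date code."""
    return parse_date_code(code) > parse_date_code(threshold)


if __name__ == "__main__":
    for code in ("UA 1730", "UA 1725", "UA 1708"):
        print(code, parse_date_code(code), is_newer_than(code))
```

Comparing (year, week) tuples instead of the raw strings keeps the ordering correct across a year boundary.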

  • sonnet
    replied
    Does anybody know if the bug is only present on the fully configured models (1700, 1700X and 1800X), or is it also present on the six-core models and on models without SMT, like the four-core 1300X and 1200?

  • rene
    replied
    See https://www.phoronix.com/forums/foru...244#post969244

  • drSeehas
    replied
    Originally posted by rene
    maybe now finally fixed in dragonflybsd? http://lists.dragonflybsd.org/piperm...st/626190.html
    See https://www.phoronix.com/forums/foru...206#post969206

  • rene
    replied
    Originally posted by kemalihsan
    Guys, such artificial torture tests may make any CPU fail. Let's not panic.
    The "situation" can also be temporary, depending on updates to the BIOS, kernel, gcc, libc, etc...
    By the way, different Linux distributions give different results; could it also be the scheduler? Are people posting which scheduler they use?
    ...
    No, an artificial torture test will not make just any CPU fail: a stable CPU should be "bug-free" enough that no matter how long you throw math at it, it will calculate correctly.

  • rene
    replied
    maybe now finally fixed in dragonflybsd? http://lists.dragonflybsd.org/piperm...st/626190.html
