Ryzen-Test & Stress-Run Make It Easy To Cause Segmentation Faults On Zen CPUs


  • #61
    Can someone tell me why this doesn't happen in windows?

    Comment


    • #62
      Michael thanks for bringing this topic to attention.
      garegin it does. It has been observed with test tools as well as "productive" MSVC workloads.

      Comment


      • #63
        Originally posted by debianxfce View Post
        Hardware vendors do test their products better than open source software community.
        Really? I think that is just your opinion; you have no proof whatsoever for this claim.
        Phantom circuit Sequence Reducer Dyslexia

        Comment


        • #64
          Originally posted by bridgman View Post

          Is "John" in this case me ? If so, I don't think I promised (or even "said") that I would provide an update, just that I would make sure the info and concerns were getting to the right people. I did that, and posted back to confirm it.

          If that isn't how you read things could you try to point me to the threads ? Thanks...
          "
          bridgman Why was the initial message you were answering deleted?
          I don't think it was deleted - still shows for me as post #314, can you check ?
          amdmatt once said there is no need to open further tickets, what to do now?
          Will ask, stay tuned."
          I can't type without sounding pissed off, I'm sorry. However, "will ask, stay tuned" reads as though you would report back on the state of affairs there. Just letting you know that people keep waiting for a response.

          That said, you're my favorite AMD guy and you've always chatted with us, which I appreciate. I don't want you to think I think poorly of you, just that AMD could ease some minds with even a few small words. I'm happy with my 1800X; not everyone is so lucky, and I try to help people, it's in my nature, so I apologize for that too, and for getting drunk and posting CRAZY YouTube comments... however those weren't about AMD lol.

          Comment


          • #65
            Originally posted by Beherit View Post
            It doesn't happen when using Microsoft Windows. Which is the diametrical opposite of open source.
            Originally posted by garegin View Post
            Can someone tell me why this doesn't happen in windows?
            The problem does happen in Windows Subsystem for Linux (WSL). This was confirmed in the AMD community forums and in the Gentoo forums.

            Comment


            • #66
              Originally posted by juno View Post
              garegin it does. It has been observed with test tools as well as "productive" MSVC workloads.
              Can you please paste a link to the report? I've searched the MSDN forums ever since this was first reported and found nothing on this. All I've seen are rumours on the Gentoo forums not saying much else but "a friend of my friend's brother saw a post somewhere he can't remember, that..."

              Originally posted by chithanh View Post

              The problem does happen in Windows Subsystem for Linux (WSL). This was confirmed in the AMD community forums and in the Gentoo forums.
              Surely, you must be aware that's because of the Linux core in WSL and nothing with Microsoft? Fair enough, I'll be more specific: Please point me to an article which shows how to reproduce this bug in Windows, and that's Windows only, not any emulated environment or compatibility layer.

              Comment


              • #67
                Originally posted by debianxfce View Post
                That proves more that problem is in the open source software.
                In the open source software? Which exactly?
                Is it in the Linux kernel? But WSL/FreeBSD/DragonFlyBSD do not use the Linux kernel.
                Is it in glibc? FreeBSD/DragonFlyBSD use different libc.
                Is it in gcc? clang crashes too.

                Originally posted by Beherit View Post
                Surely, you must be aware that's because of the Linux core in WSL and nothing with Microsoft? Fair enough, I'll be more specific: Please point me to an article which shows how to reproduce this bug in Windows, and that's Windows only, not any emulated environment or compatibility layer.
                WSL is not an emulated environment or compatibility layer. It is a subsystem of the Windows kernel which implements the Linux ABI and which was written entirely by Microsoft. (Its origins are reportedly the ill-fated "Project Astoria" which implemented Android app compatibility for Windows Mobile)

                Comment


                • #68
                  Thank you Michael for your work. Phoronix was our only hope in this issue - I am also affected with a 1700X and I am in the process of RMA.

                  @chithanh: debianxfce is a known troll in phoronix forums, don't feed it. Just ignore him.

                  @Beherit: Would a failure with mingw-w64 under Windows be OK for you, or do you want an MSVC workload? I am in the process of testing the build of ffmpeg under Windows with both mingw-w64 and MSVC.

                  Comment


                  • #69
                    Originally posted by chithanh View Post
                    It is a subsystem of the Windows kernel which implements the Linux ABI
                    And it's also a compatibility layer.

                    Originally posted by https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux
                    Windows Subsystem for Linux (WSL) is a compatibility layer for running Linux binary executables (in ELF format) natively on Windows 10.
                    See?

                    Originally posted by https://en.wikipedia.org/wiki/Compatibility_layer
                    In software engineering, a compatibility layer is an interface that allows binaries for a legacy or foreign system to run on a host system.
                    You'll also find WSL listed as an example of a compatibility layer a bit further down the article.

                    Still not convinced?

                    Originally posted by https://blogs.msdn.microsoft.com/wsl/2016/04/22/windows-subsystem-for-linux-overview/
                    Pico processes and drivers provide the foundation for the Windows Subsystem for Linux, which runs native unmodified Linux binaries by loading executable ELF binaries into a Pico process’s address space and executes them atop a Linux-compatible layer of syscalls.
                    I understand what you're trying to say, that closed source is to blame for this, as the bug appears in something fully developed in-house by Microsoft. But whether you're right or wrong has little to do with the question of why this bug hasn't been reported by those Ryzen users who use native Windows compilers. I've read confirmed reports by those using Linux, FreeBSD, gcc, clang... but none from MSVC devs (or users of whatever other Windows compilers are in use these days).

                    (I'm also curious if someone managed to run a Ryzen based Hackintosh, if segfaults occur in macOS as well)

                    Comment


                    • #70
                      Here are my (preliminary) results:

                      phoronix stress run

                      PTS_CONCURRENT_TEST_RUNS=4 TOTAL_LOOP_TIME=60 phoronix-test-suite stress-run build-linux-kernel build-php pgbench redis


                      My system seems to be quite stable so far. The only crashes I find via dmesg are a bunch from php's configure script, and I'm not sure whether they are supposed to crash. But I'm typing this message with a system load of 50 and Spotify plays without any hiccups.
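For anyone who wants to compare dmesg output after a stress run, here is a minimal sketch of how the segfault lines could be tallied per process. It assumes the usual x86 Linux kernel message format (`name[pid]: segfault at ...`); the function name `count_segfaults` is made up for this example and is not part of phoronix-test-suite or kill-ryzen.sh.

```shell
#!/bin/sh
# Hypothetical helper: count kernel-log segfault lines per process name.
# Assumes lines shaped like:
#   [  100.000001] conftest[111]: segfault at 0 ip ... sp ... error 4 in ...
# Reads log text on stdin, prints "<count> <process>" sorted by count.
# Process names containing spaces are only captured by their last word.
count_segfaults() {
  awk '
    /segfault at/ {
      for (i = 2; i <= NF; i++) {
        if ($i == "segfault") {
          name = $(i - 1)                 # e.g. "conftest[111]:"
          sub(/\[[0-9]+\]:$/, "", name)   # strip the "[pid]:" suffix
          count[name]++
          break
        }
      }
    }
    END { for (n in count) print count[n], n }
  ' | sort -rn
}

# Typical use after a stress run (may need root on some distributions):
#   dmesg | count_segfaults
```

A tally like this makes it easier to tell expected crashes (for example configure-script probes, which may fault on purpose) apart from random crashes in gcc, bash or other unrelated processes.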

                      kill-ryzen.sh
                      It has been running for a little while now without any problems so far.

                      Edit: No problem after 1.5h++

                      Would those of you who get an unstable system maybe try my kernel? I'm using the official kernel from AMD's ROCm repo, albeit on Arch Linux:

                      https://github.com/RadeonOpenCompute/ROCm

                      It's maybe worth mentioning that I had to manually enter my memory's timings to get it running faster than stock speeds on the ASUS PRIME B350M-A. AFAIR, I had to increase the voltage to get it working. But higher than some 2666 MHz wasn't possible.

                      Overview:
                      Board: ASUS PRIME B350M-A
                      Memory: CMK32GX4M2B3200C16, two times
                      CPU: AMD Ryzen 7 1800X Eight-Core Processor
                      Microcode: 0x800111c
                      Kernel: 4.11.0-kfd-compute-rocm-rel-1.6-115
                      Last edited by oleid; 08-05-2017, 04:52 AM. Reason: Final kill-ryzen result

                      Comment
