Some FreeBSD Users Are Still Running Into Random Lock-Ups With Ryzen


  • #31
    Originally posted by shmerl View Post
    Ryzen is very sensitive to power stability. In short, you need UPS to avoid problems.
    Unless you are living in a third world country, this shouldn't be an issue. As Guest mentioned, the power supplies inside the machine are very good at stabilizing the voltage, even when mains power drops out in short bursts. If anything, the cause could be an unstable power supply, which a UPS may or may not help with.

    Comment


    • #32
      Originally posted by mir3x View Post

      Have you added rcu_nocbs=0-15 to the kernel command line?
      If not, then of course it will hang.
      That was a reply to monraaf's post; I wanted to point out that the issue is not fixed upstream as of kernel 4.13. I use linux-ryzen-git on Arch Linux and yes, with this workaround (rcu_nocbs=0-15) most of the freezes are gone.
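A minimal sketch for checking whether the running kernel was actually booted with the rcu_nocbs workaround, and whether the range covers every logical CPU. It assumes a Linux system where /proc/cmdline is readable. The helper names are made up for illustration, and only the simple CPU-list forms ("0-15", "0,2,4-7") are handled. The parameter only has an effect on kernels built with CONFIG_RCU_NOCB_CPU.

```python
import os
from pathlib import Path


def parse_cpu_list(spec: str) -> set[int]:
    """Expand a simple kernel CPU list such as "0-3,8,10-11" into a set of ints."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def nocb_cpus(cmdline: str) -> set[int]:
    """Return the CPUs named by rcu_nocbs= on a kernel command line (empty if absent)."""
    for token in cmdline.split():
        if token.startswith("rcu_nocbs="):
            return parse_cpu_list(token.split("=", 1)[1])
    return set()


if __name__ == "__main__":
    try:
        cmdline = Path("/proc/cmdline").read_text()
    except OSError:
        cmdline = ""  # not on Linux, or /proc is unavailable
    wanted = set(range(os.cpu_count() or 1))
    try:
        missing = wanted - nocb_cpus(cmdline)
    except ValueError:
        # CPU-list forms this sketch does not understand
        print("rcu_nocbs is set, but in a form this sketch cannot parse")
    else:
        if missing:
            print(f"CPUs without RCU callback offloading: {sorted(missing)}")
        else:
            print("rcu_nocbs covers all logical CPUs")
```

On a 16-thread Ryzen booted with rcu_nocbs=0-15 this should report full coverage. If it lists CPUs, the parameter is missing or the range is too short for the machine.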

      Comment


      • #33
        Originally posted by dungeon View Post

        The bug probably just has something to do with XFR, since disabling C6 also disables XFR, that additional turbo speed... that is how this sounds to me.

        Are the non-X Ryzens affected in the same way as the X models?
        For me the lockups mostly happened while the system was idle. Of course this does not rule out XFR, but it wouldn't be my first guess.
        Also, I'm curious. My mainboard firmware setup didn't mention that disabling C6 states also disables XFR. Do you have a source for this?

        Edit: I've quickly run a few benchmarks with and without C6 enabled. Disabling C6 states indeed limits the maximum boost speed that can be achieved.
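One rough way to repeat this kind of comparison on Linux is to sample the "cpu MHz" field of /proc/cpuinfo while a benchmark runs, once with C6 enabled and once with it disabled. This is only a sketch, not the method used for the benchmarks above: the function names and sampling defaults are assumptions, and the reported clock is a snapshot that can miss short boost peaks.

```python
import time
from pathlib import Path


def cpu_mhz(cpuinfo_text: str) -> list[float]:
    """Extract every "cpu MHz" value from the text of /proc/cpuinfo."""
    values = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            values.append(float(value))
    return values


def peak_mhz(samples: int = 5, interval: float = 0.2,
             path: Path = Path("/proc/cpuinfo")) -> float:
    """Poll the file `samples` times and return the highest clock seen on any core."""
    peak = 0.0
    for _ in range(samples):
        peak = max([peak, *cpu_mhz(path.read_text())])
        time.sleep(interval)
    return peak


if __name__ == "__main__":
    try:
        print(f"highest observed clock: {peak_mhz():.0f} MHz")
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

Run it in a second terminal while the benchmark is active, then compare the two numbers.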
        Last edited by soulsource; 23 January 2018, 03:40 AM.

        Comment


        • #34
          Originally posted by soulsource View Post
          As my original Ryzen CPU had shown the random segfault issue, I got a replacement chip via AMD RMA. Since then there have been no more segfaults, but, like other users, I've had random freezes when the system was idling.
          The CONFIG_RCU_NOCB_CPU option didn't have any effect on my machine, but disabling C6 states via the firmware settings made the computer run rock solid. Not a single freeze since I changed this four months ago.
          Could you please check how you did that and let me know?

          I have the idling problem too. To keep the computer from freezing, I run an emulator session in the background so the machine stays busy; that way it never crashes. (I'm using kernel 4.14.14 at the moment.)
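Independent of the firmware menu, Linux exposes the idle states the kernel knows about under /sys/devices/system/cpu/cpuN/cpuidle/. The read-only sketch below lists them for cpu0 and shows whether each one is currently disabled. The function name is hypothetical, the state names vary by platform and idle driver, and whether disabling a state from sysfs avoids the freezes in the same way as the firmware setting is an assumption nobody in this thread has tested.

```python
from pathlib import Path

# Standard cpuidle sysfs location for the first logical CPU
CPUIDLE_ROOT = Path("/sys/devices/system/cpu/cpu0/cpuidle")


def list_idle_states(root: Path = CPUIDLE_ROOT) -> list[tuple[str, str, bool]]:
    """Return (directory, state name, disabled?) for each cpuidle state under root."""
    if not root.is_dir():
        return []  # no cpuidle driver loaded, or not a Linux system
    state_dirs = sorted(root.glob("state[0-9]*"), key=lambda p: int(p.name[len("state"):]))
    states = []
    for state_dir in state_dirs:
        name = (state_dir / "name").read_text().strip()
        disabled = (state_dir / "disable").read_text().strip() == "1"
        states.append((state_dir.name, name, disabled))
    return states


if __name__ == "__main__":
    for directory, name, disabled in list_idle_states():
        print(f"{directory}: {name} ({'disabled' if disabled else 'enabled'})")
```

Writing 1 to a state's disable file as root turns that state off for that CPU until it is re-enabled or the machine reboots, so it would have to be repeated for every CPU and on every boot.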

          Comment


          • #35
            Originally posted by Tomin View Post
            I get random lockups when writing stuff with gvim, which is very annoying when you happen to be a computer science major whose favourite editor is (g)vim... I upgraded to the latest BIOS (F10 for my Gigabyte GA-AB350-Gaming motherboard; for some reason BIOS versions newer than F4 have issues with bcache setups), which didn't fix the lockups. Now I've disabled C-state power saving (I don't remember the exact setting) and it seems to be stable. I'm running Fedora 27 with stock kernels.

            The CPU also has the segfault bug when using many threads, but I can't replace it yet because I really need that CPU for now. Maybe in a month or two I can RMA it.
            I have the same motherboard and the exact same problem. I'm running firmware F10, but I have no idea how to disable C-states. Can you please let me know how?

            Please, please, please :-)

            Comment


            • #36
              Originally posted by JPFSanders View Post

              I have the same motherboard and the exact same problem. I'm running firmware F10, but I have no idea how to disable C-states. Can you please let me know how?

              Please, please, please :-)
              I think it is called Global C-state Control, and it is under Advanced Frequency Settings if I remember correctly. I don't know whether it does anything else as well. I've also disabled some other power-saving settings for reasons unrelated to these issues. It's pretty annoying that manufacturers don't update their manuals when they update the BIOS. The current version of the manual says there is a C6 Mode option, but that does not exist on my computer.

              Comment


              • #37
                Originally posted by starshipeleven View Post
                Slight OT, but I'm looking for a Mini-ITX Ryzen board that supports ECC AND has an onboard DisplayPort (hoping the new Ryzen-based APUs support ECC too), and your board is one of the very few that seem to fit the bill. Are there any options about ECC in your BIOS/UEFI settings?

                Could you look in dmesg for lines about EDAC with dmesg | grep EDAC? (It should print something like this, from a user who does have ECC RAM installed: https://www.reddit.com/r/Amd/comment...ryzen/dhtm6az/)
                At first I thought I had misspelled my motherboard model, but that's not the case. This is a full-size ATX motherboard, not Mini-ITX (or even Micro-ATX), and it has DVI-D and HDMI connectors but no DP. Anyway, there are no ECC options in the BIOS at all. I don't have ECC RAM myself, and that dmesg command just prints:
                Code:
                [    0.062015] EDAC MC: Ver: 3.0.0
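For anyone who wants to do the same check from a script, here is a small sketch that filters saved dmesg output for EDAC messages, much like dmesg | grep EDAC. The function name is made up, and it only recognises lines that start with the usual bracketed timestamp. A lone "EDAC MC: Ver" line like the one above only shows that the EDAC core loaded, not that ECC is active.

```python
import re

# dmesg lines normally look like "[    0.062015] EDAC MC: Ver: 3.0.0"
DMESG_LINE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")


def edac_messages(dmesg_text: str) -> list[tuple[float, str]]:
    """Return (timestamp, message) for every timestamped line that mentions EDAC."""
    found = []
    for line in dmesg_text.splitlines():
        match = DMESG_LINE.match(line.strip())
        if match and "EDAC" in match.group("msg"):
            found.append((float(match.group("ts")), match.group("msg")))
    return found


if __name__ == "__main__":
    sample = "[    0.062015] EDAC MC: Ver: 3.0.0"
    for timestamp, message in edac_messages(sample):
        print(f"{timestamp:.6f}s  {message}")
```

Feed it the text of dmesg (for example, saved to a file) rather than the hard-coded sample line.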

                Comment


                • #38
                  Originally posted by Tomin View Post
                  At first I thought I had misspelled my motherboard model, but that's not the case. This is a full-size ATX motherboard, not Mini-ITX (or even Micro-ATX), and it has DVI-D and HDMI connectors but no DP. Anyway, there are no ECC options in the BIOS at all. I don't have ECC RAM myself, and that dmesg command just prints:
                  Code:
                  [ 0.062015] EDAC MC: Ver: 3.0.0
                  Oops, my fault. Thanks for answering me anyway.
                  Yeah, as you noticed, I read your board name wrong. This is the one I thought was yours (and the one I'm looking for): GA-AB350N-Gaming http://www.gigabyte.fi/Motherboard/G...WIFI-rev-10#kf

                  Your board does not even state it has ECC support, so whatever.

                  Comment


                  • #39
                    Originally posted by shmerl View Post
                    Ryzen is very sensitive to power stability. In short, you need UPS to avoid problems.
                    I do use a UPS and I still have the problem. At least one other user has said the same to me.

                    Originally posted by monraaf View Post
                    No, at least not the soft-lockups. Those were definitely caused by a kernel bug as they were also visible on large SPARC machines.
                    Lockups can happen for all sorts of reasons. Admittedly, I haven't tried a new kernel in a couple of months, but I'd be very surprised if it is fixed, especially given that none of the other users have reported it fixed and we're even seeing reports from other operating systems.
                    Last edited by Chewi; 22 January 2018, 11:28 AM.

                    Comment


                    • #40
                      Originally posted by monraaf View Post

                      This was fixed in 4.13rc-something. These soft-lockups have been visible on large SPARC machines as well. This issue is not specific to Ryzen/Threadripper.

                      My office has one Threadripper machine which had this issue as well. With the updated kernel, the problem went away.

                      The machine is running 24/7 and is used for heavy number crunching.
                      You mean the workaround has been harder to apply since that patch from some Intel+IBM dude in kernel 4.13-rc.
                      There used to be an option to apply that RCU setting to all cores (the same thing as the rcu_nocbs cmdline parameter, except that it detected the number of threads and applied it to all of them), which distribution maintainers could just enable in their kernel build. Some IBM dude removed that option in response to some other Intel patch, saying it was useless... And now, since 4.13-rc, you're forced to add a cmdline parameter too, which makes the workaround impossible to apply distribution-wide, as maintainers can't know how many threads you will have in your box.

                      I had been using that workaround since 4.11ish; it stopped working with 4.13-rc. (I did not see that patch from Intel coming, so I was sure it was linked to the experimental AMD DC stuff I was toying with.)

                      Thanks again Intel/IBM... https://cateee.net/lkddb/web-lkddb/R...B_CPU_ALL.html
                      Last edited by RavFX; 22 January 2018, 12:58 PM.
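Since the range now has to match each machine's thread count, the parameter can at least be generated per box instead of hard-coded. A minimal sketch, assuming the goal is simply to offload all logical CPUs as the removed build option did; the function name is hypothetical, and getting the resulting string into the bootloader configuration is left out.

```python
import os


def rcu_nocbs_param(cpu_count: int) -> str:
    """Build an rcu_nocbs= argument covering logical CPUs 0 .. cpu_count - 1."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    last = cpu_count - 1
    return "rcu_nocbs=0" if last == 0 else f"rcu_nocbs=0-{last}"


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs (threads) and may return None
    print(rcu_nocbs_param(os.cpu_count() or 1))
```

On a 16-thread Ryzen this prints rcu_nocbs=0-15, the value quoted earlier in the thread; on a 32-thread Threadripper it would print rcu_nocbs=0-31.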

                      Comment
