Ryzen Stability Issues Are Still Affecting Some FreeBSD Users


  • #21
    Originally posted by shmerl View Post

    What are your normal idle temperatures with the 2700X? For me they're rather high (idle jumped from 30°C to 39°C after switching from a 1700X to a 2700X), and that's with a dual-fan Noctua cooler. I wonder if something got messed up or if it's supposed to be that high?
    Should really be the same (with the default coolers, of course); only the offset policy is different (see the table below).

    Model                 Junction temp (example)   tCTL offset   Reported tCTL
    Threadripper 1950X    43°C                      27°C          70°C
    Threadripper 1920X    43°C                      27°C          70°C
    Ryzen 7 2700X         38°C                      10°C          48°C
    Ryzen 5 2600X         38°C                      10°C          48°C
    Ryzen 7 1800X         38°C                      20°C          58°C
    Ryzen 7 1700X         38°C                      20°C          58°C
    Ryzen 7 1700          38°C                      0°C           38°C
    http://www.guru3d.com/articles-pages...-review,7.html
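To make the offset arithmetic concrete, here is a minimal Python sketch. It assumes the three temperature columns above are junction temperature, tCTL offset, and the temperature reported by tCTL, which is an interpretation supported by the rows (first plus second equals third). The function name `junction_temp_c` and the dictionary are made up for illustration; nothing here reads real sensors.

```python
# tCTL offsets in °C, copied from the table above (not queried from hardware).
TCTL_OFFSET_C = {
    "Threadripper 1950X": 27,
    "Threadripper 1920X": 27,
    "Ryzen 7 2700X": 10,
    "Ryzen 5 2600X": 10,
    "Ryzen 7 1800X": 20,
    "Ryzen 7 1700X": 20,
    "Ryzen 7 1700": 0,
}


def junction_temp_c(model: str, reported_tctl_c: float) -> float:
    """Subtract the model's tCTL offset from the reported temperature."""
    return reported_tctl_c - TCTL_OFFSET_C[model]


if __name__ == "__main__":
    # Hypothetical case: a tool shows 39 °C on a 2700X without applying the offset.
    print(junction_temp_c("Ryzen 7 2700X", 39.0))  # 29.0
```

If a monitoring tool shows raw tCTL for the 2700X, a 39°C reading would correspond to roughly 29°C at the junction under this reading of the table, so part of an apparent jump between CPU generations can come from the offset alone.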

    But then again, it may well be different when you use your own cooler and keep the same one across both CPUs. I don't know how much the default coolers for the 1700X and 2700X differ. They likely do differ, since there is 95 W vs 105 W of TDP to cool, while users see the same temperatures with the default ones.

    So yes, it is supposed to run hotter with the same cooler; just look at the TDP difference.
    Last edited by dungeon; 24 April 2018, 04:22 AM.



    • #22
      Originally posted by Leopard View Post

      No.

      The same Ryzens and the same PSUs work on Windows without problems.

      The problem is Linux-specific, and as a responsible hardware vendor AMD should have solved this issue long ago.
      I had this happen to me on Linux once or twice a week for about half a year. I expected the fix to come with some new Linux version, but then I decided to update my BIOS and now the issue is fixed. Maybe something else changed at about the same time as I updated my BIOS, but I doubt it. I have an R5 1600 and an ASUS B350 mATX motherboard.



      • #23
        Originally posted by VikingGe View Post
        You sure it's actually your CPU locking up in this particular case and not your GPU? Because that sadly happens a lot with DXVK on RADV.

        That said, I got my 2700X today, no issues compiling stuff or anything, I really hope it's stable.
        I don't know for sure, but the traces I've captured so far seem to be identical to the idle crashes. I'll do some more testing with this in mind.



        • #24
          Originally posted by Brisse View Post
          Aaah, the PSU. That mysterious component that non electrical engineers always blame when they have no idea what the actual problem is.
          I did not blame the PSU, I just said that it is not excluded from the short list of potential causes.

          And it is known that a number of PSUs have problems with C6 states as it puts a lot of stress on them, and when a PSU goes bad that will become apparent first in stress situations.

          Originally posted by Leopard View Post
          The same Ryzens and the same PSUs work on Windows without problems.

          The problem is Linux-specific, and as a responsible hardware vendor AMD should have solved this issue long ago.
          That is wrong. Windows users are seeing similar problems too, although less frequently. And even if it were different, that would also not be proof that Ryzen is defective.
          If Linux exercises low power states in a different way compared to Windows, this can push a bad PSU which appears to work fine in Windows over the edge.

          Originally posted by dwagner View Post
          Wow, the "it's the power supply unit!" AND the "it's Linux!" rumor in two consecutive posts... *sigh*.
          Do note that I placed no blame anywhere, I just listed the things that can be responsible still and have not been reasonably excluded from the list.

          Originally posted by dwagner View Post
          Quite obviously, there are many people now who got an exchange Ryzen CPU from AMD, and then no longer have the stability issues they had before - using the same power supply and operating system, that is.
          If it is a marginality problem, then later Ryzens with better silicon characteristics may continue to run "fine" (i.e. not crash the whole system).

          Originally posted by dwagner View Post
          So no, AMD definitely has an issue, and while they seem able and willing to test exemplars for not having that issue and sending them to users who are affected by the issue, there is still no indication that the root cause has ever been found or fixed.
          Nor has any particular component of the system been singled out as the cause.



          • #25
            Originally posted by schwarzman View Post

            As far as I know the mainline kernel doesn't know about the 10° offset for the Ryzen 2xxx - did you check that?
            Yeah, that's what I suspected. It's most likely that offset.



            • #26
              Originally posted by chithanh View Post
              I did not blame the PSU, I just said that it is not excluded from the short list of potential causes.
              It wasn't really aimed at you personally. Just an interesting observation I've made while lurking on different forums.

              Originally posted by chithanh View Post
              And it is known that a number of PSUs have problems with C6 states as it puts a lot of stress on them, and when a PSU goes bad that will become apparent first in stress situations.
              Could you please explain this in electrical engineering terms instead of ambiguous layman terms?



              • #27
                Brisse
                Some explanation about PSUs and C6 state here:
                https://www.pcper.com/news/Cases-and...l-Haswell-CPUs

                Basically, C6 sometimes lets the load on the 12 V rail drop below 0.05 A. Keeping the 12 V rail within tolerance across such a wide range of current is challenging, and if a capacitor in the PSU has gone bad, the result may be undervoltage when the load increases sharply.

                This may explain why it happens only at idle (CPU goes in and out of C6 a lot) and has not yet been observed on Epyc systems (those usually have server PSUs and not cheap consumer parts).

                Edit: In even more electrical engineering terms, this is called transient response.

                "The first one is the series resistance of the output capacitor. If that resistance is too high, then the load step creates a large voltage deviation before the control loop can respond."
                http://www.electronicdesign.com/test...pply-bandwidth

                And this specifically increases with the age of the capacitor:

                "The output degradation is typically measured by an increase in ESR (Equivalent Series Resistance) and decrease in the capacitance value over long periods of use even under nominal operating conditions."
                https://www.phmsociety.org/sites/phm...hmc_10_030.pdf
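As a rough illustration of the ESR argument above, the initial voltage deviation after a load step can be approximated as the current step times the ESR of the output capacitor. The Python sketch below is a simplification: it ignores the capacitance droop and the control loop's response. The load step and ESR values are made-up numbers for illustration, and the ±5% window is the commonly cited ATX tolerance for the +12 V rail, treated here as an assumption.

```python
NOMINAL_12V = 12.0
TOLERANCE = 0.05  # assumed ±5 % regulation window for the +12 V rail


def initial_deviation_v(load_step_a: float, esr_ohm: float) -> float:
    """First-order voltage deviation right after a load step: dV = dI * ESR."""
    return load_step_a * esr_ohm


def within_tolerance(rail_v: float) -> bool:
    """Check whether a rail voltage lies inside the assumed ±5 % window."""
    return NOMINAL_12V * (1 - TOLERANCE) <= rail_v <= NOMINAL_12V * (1 + TOLERANCE)


if __name__ == "__main__":
    load_step_a = 10.0  # hypothetical jump in 12 V current when the CPU leaves C6
    for label, esr_ohm in (("healthy", 0.02), ("aged", 0.08)):  # hypothetical ESR values
        rail_v = NOMINAL_12V - initial_deviation_v(load_step_a, esr_ohm)
        print(f"{label}: {rail_v:.2f} V, within tolerance: {within_tolerance(rail_v)}")
```

With these made-up numbers the healthy capacitor dips to about 11.8 V and stays in the window, while the aged one dips to about 11.2 V and falls below it, which is the kind of undervoltage on a sharp load increase described above.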
                Last edited by chithanh; 24 April 2018, 10:35 AM.



                • #28
                  Regarding C6: it has nothing to do with a faulty PSU or Haswell-like PSU support. It's a bug in the Ryzen 1 series. I have exactly the same PSU, and the Ryzen 2700X works without any issues with C6 enabled for both package and cores. On exactly the same PSU and board, my Ryzen 1700X was freezing unless I disabled package C6 through the "Power supply common current idle" setting in the firmware.
                  Last edited by shmerl; 24 April 2018, 10:48 AM.



                  • #29
                    chithanh I'm impressed. You get a gold star for that explanation



                    • #30
                      Originally posted by Djhg2000 View Post
                      My 2400G still locks up on Linux. The most reliable trigger seems to be running Far Cry 4 in WINE with DXVK. It can happen at idle as well, but it can survive days or weeks at idle while the above case usually triggers a crash within seconds.

                      Curiously, all of the traces I've been able to capture point to a general protection fault in syscall 64 (semget). Anyone with similar logs? I used pstore to have a persistent kernel log area in RAM which usually survives a hard reset.
                      Hi, I have an ASRock AB350 Pro4 motherboard and I'm getting freezes every time the GPU is loaded by Phoronix gputest or even games. This is on the latest BIOS, P4.70. All components are 2 weeks old. Seasonic 600W PSU.

