Linux Fixes Hosts Randomly Rebooting During Virtualization With Ryzen 7000/8000 CPUs

  • MastaG
    Senior Member
    • May 2012
    • 431

    #11
    Just settle down with 6.12 and all of your troubles belong to the past.


    • yump
      Senior Member
      • Aug 2021
      • 505

      #12
      Wait, Epyc 4004? I thought that was just client AM5 Ryzen 7000 with a different part number.

      Patch is:

      Code:
      +    /*
      +     * These Zen4 SoCs advertise support for virtualized VMLOAD/VMSAVE
      +     * in some BIOS versions but they can lead to random host reboots.
      +     */
      +    switch (c->x86_model) {
      +    case 0x18 ... 0x1f:
      +    case 0x60 ... 0x7f:
      +        clear_cpu_cap(c, X86_FEATURE_V_VMSAVE_VMLOAD);
      +        break;
      +    }
      Anybody with an actual Epyc 4004 feel like checking?
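For anyone who wants to check whether their own CPU falls inside the model ranges in the patch above, here is a minimal sketch in C. It assumes the usual AMD decoding of the CPUID leaf 1 EAX signature (extended family and model are folded in when the base family is 0xF) and that the quirk applies to family 0x19 parts only. The function names `cpu_family`, `cpu_model`, and `in_quirk_range` are made up for illustration and are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode the display family from a CPUID leaf 1 EAX value (AMD rules). */
unsigned int cpu_family(uint32_t eax)
{
    unsigned int base = (eax >> 8) & 0xf;   /* bits 11:8  */
    unsigned int ext = (eax >> 20) & 0xff;  /* bits 27:20 */

    return base == 0xf ? base + ext : base;
}

/* Decode the display model from a CPUID leaf 1 EAX value (AMD rules). */
unsigned int cpu_model(uint32_t eax)
{
    unsigned int base_family = (eax >> 8) & 0xf;
    unsigned int base_model = (eax >> 4) & 0xf;   /* bits 7:4   */
    unsigned int ext_model = (eax >> 16) & 0xf;   /* bits 19:16 */

    return base_family == 0xf ? (ext_model << 4) | base_model : base_model;
}

/*
 * Mirror the model ranges from the quoted patch. Assumption: the quirk
 * only matters for family 0x19, since the patch sits in Zen4 init code.
 */
bool in_quirk_range(uint32_t eax)
{
    unsigned int model = cpu_model(eax);

    if (cpu_family(eax) != 0x19)
        return false;

    return (model >= 0x18 && model <= 0x1f) ||
           (model >= 0x60 && model <= 0x7f);
}
```

As a worked example, the signature value 0x00A60F12 decodes to family 0x19, model 0x61, so `in_quirk_range` returns true for it, while 0x00A10F11 decodes to model 0x11 and falls outside both ranges.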


      • Espionage724
        Senior Member
        • Sep 2024
        • 325

        #13
        Originally posted by dayone
        This post is about nested VMs, aka VMs running in VMs. It's a software issue and not a hardware issue like with Intel CPUs.

        It's sounding pretty hardware to me if a VM on an AMD CPU is causing a host-side reboot, especially if it's a bug/not correct behavior

        The whole point of userspace or whatever from Windows XP -> Vista and other OSs is to not have software crashing hardware. Software being able to trigger hardware to cause a reboot sounds pretty busted, and I haven't heard of anything comparable on Intel yet.

        With that above code patch; wtf does that even mean? That sounds like a platform-specific fix that's implying AMD allows their CPUs alongside their own AGESA updates to somehow exist on broken hardware. Yeah that's not inspiring confidence in a chain of stability
        Last edited by Espionage724; 17 November 2024, 06:54 PM.


        • Forge
          Senior Member
          • Apr 2008
          • 183

          #14
          Originally posted by Espionage724

          It's sounding pretty hardware to me if a VM on an AMD CPU is causing a host-side reboot, especially if it's a bug/not correct behavior

          The whole point of userspace or whatever from Windows XP -> Vista and other OSs is to not have software crashing hardware. Software being able to trigger hardware to cause a reboot sounds pretty busted, and I haven't heard of anything comparable on Intel yet.

          With that above code patch; wtf does that even mean? That sounds like a platform-specific fix that's implying AMD allows their CPUs alongside their own AGESA updates to somehow exist on broken hardware. Yeah that's not inspiring confidence in a chain of stability
          Zen 4, aka Ryzen 7000/8000, does not support virtualized VMLOAD/VMSAVE. Never has. The hardware is there, because it's shared silicon with EPYC 4004, but it's not supposed to be enabled/advertised. It is, due to motherboard OEMs just enabling everything and shipping. The kernel-side fix is easy and means you don't need a firmware update.
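One rough way to see whether the kernel-side fix is active is to look at the `flags` line in /proc/cpuinfo and check that the feature is no longer listed. Below is a minimal sketch in C of the token check only. It assumes the flag shows up there as `v_vmsave_vmload`, and the helper name `flags_contain` is hypothetical. Reading the file itself is left out.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `name` appears as a whole whitespace-delimited token
 * in a /proc/cpuinfo style "flags : ..." line. Substring hits inside a
 * longer flag name do not count.
 */
bool flags_contain(const char *flags, const char *name)
{
    size_t len = strlen(name);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, name)) != NULL) {
        /* The token must start the line or follow a separator. */
        bool starts = (p == flags) || p[-1] == ' ' ||
                      p[-1] == '\t' || p[-1] == ':';
        /* The token must end at a separator or the end of the string. */
        char end = p[len];
        bool ends = end == '\0' || end == ' ' ||
                    end == '\t' || end == '\n';

        if (starts && ends)
            return true;
        p += 1; /* keep scanning after a partial match */
    }
    return false;
}
```

If `flags_contain(line, "v_vmsave_vmload")` still returns true on one of the affected models, the running kernel most likely does not carry the patch yet.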

          In a more perfect world, we'd probably be able to dump out some supporting microcode/firmware enablement from a board that properly supports EPYC 4004 and enable VMLOAD/VMSAVE for everyone on Zen4, but it's really not worth the effort.

          I was affected by this, replaced quite a bit of hardware trying to diagnose. Unlike some, I won't be making any melodramatic edicts about it.


          • theriddick
            Senior Member
            • Oct 2015
            • 1733

            #15
            I think I've experienced this a few times and was like WTF... thought I was experiencing some sort of PSU fault!

            I've got the VM flags enabled in the BIOS and don't use a VM currently, but I have experienced this even when a guest VM isn't running, so perhaps it also happens when support is enabled but unused?


            • Espionage724
              Senior Member
              • Sep 2024
              • 325

              #16
              Originally posted by Forge

              Zen 4, aka Ryzen 7000/8000, does not support virtualized VMLOAD/VMSAVE. Never has. The hardware is there, because it's shared silicon with EPYC 4004, but it's not supposed to be enabled/advertised. It is, due to motherboard OEMs just enabling everything and shipping. The kernel-side fix is easy and means you don't need a firmware update.
              So, AMD being cheap and leaving vendors to be indirectly incompetent, along with end-users having the results after paying for all the hardware involved? I'm still not seeing a chain of confidence

              What is newer hardware doing re-using old hardware and cutting out a feature on the old hardware? And why would someone buy that?


              • sophisticles
                Senior Member
                • Dec 2015
                • 2547

                #17
                Originally posted by intelfx
                This issue is about nested virtualization, i.e. VMSAVE/VMLOAD in the guest. You won't ever experience this issue unless you really go out of your way (and get unlucky with the BIOS/ucode).
                Anyone who even thinks about running a VM within a VM deserves to have his system shut down!!!


                • Forge
                  Senior Member
                  • Apr 2008
                  • 183

                  #18
                  Originally posted by Espionage724

                  So, AMD being cheap and leaving vendors to be indirectly incompetent, along with end-users having the results after paying for all the hardware involved? I'm still not seeing a chain of confidence

                  What is newer hardware doing re-using old hardware and cutting out a feature on the old hardware? And why would someone buy that?
                  Do you have any idea what you are talking about? What “old hardware”? Ryzen 7000/8000 and EPYC 4000 are all contemporary and run on the exact same silicon.


                  • intelfx
                    Senior Member
                    • Jun 2018
                    • 1083

                    #19
                    Originally posted by theriddick
                    I think I've experienced this a few times and was like WTF... thought I was experiencing some sort of PSU fault!

                    I've got the VM flags enabled in the BIOS and don't use a VM currently, but I have experienced this even when a guest VM isn't running, so perhaps it also happens when support is enabled but unused?
                    No, it doesn't and you haven't. Unless you're using not just VMs but nested VMs, whatever you are affected by is not this problem.


                    • intelfx
                      Senior Member
                      • Jun 2018
                      • 1083

                      #20
                      Originally posted by sophisticles

                      Anyone who even thinks about running a VM within a VM deserves to have his system shut down!!!
                      You should run for the word-twisting and mental gymnastics contest. Easy podium for you there, mate.
