AMD Sends In Bits Of New Hardware Blocks For Linux 5.18 Radeon Updates


    Phoronix: AMD Sends In Bits Of New Hardware Blocks For Linux 5.18 Radeon Updates

    Building off last week's Radeon graphics driver updates for Linux 5.18 that included introducing AMDKFD CRIU and enabling FreeSync Video Mode by default, Friday evening brought a second batch of feature updates for this next kernel version...


  • #2
    Code:
    Mario Limonciello (4):
        drm/amd: smu7: downgrade voltage error to info
        drm/amd: Check if ASPM is enabled from PCIe subsystem
        drm/amd: Refactor `amdgpu_aspm` to be evaluated per device
        drm/amd: Use amdgpu_device_should_use_aspm on navi umd pstate switching
    Others and I have been having trouble using GPU passthrough with the Navi 6000 series cards. Please let this be related to our woes and fix resetting the card on VM restart....
    Last edited by akarypid; 19 February 2022, 06:07 PM.



    • #3
      Originally posted by akarypid View Post
      Code:
      Mario Limonciello (4):
      drm/amd: smu7: downgrade voltage error to info
      drm/amd: Check if ASPM is enabled from PCIe subsystem
      drm/amd: Refactor `amdgpu_aspm` to be evaluated per device
      drm/amd: Use amdgpu_device_should_use_aspm on navi umd pstate switching
      Others and I have been having trouble using GPU passthrough with the Navi 6000 series cards. Please let this be related to our woes and fix resetting the card on VM restart....
      Our GPUs do not support FLR. You'll need to use secondary bus resets if you want to use a generic PCI reset mechanism.



      • #4
        Originally posted by agd5f View Post

        Our GPUs do not support FLR. You'll need to use secondary bus resets if you want to use a generic PCI reset mechanism.
        The problem is that https://github.com/gnif/vendor-reset does not support 6000 series GPUs.

        It also doesn't help that apparently some 6000 series GPUs *do* work, and it was even posted in a Level1Techs video, leading people to think that a 6000 series would work. I myself fell victim to that (and bought a 6700xt thinking FLR works).



        • #5
          Originally posted by akarypid View Post

          The problem is that https://github.com/gnif/vendor-reset does not support 6000 series GPUs.

          It also doesn't help that apparently some 6000 series GPUs *do* work, and it was even posted in a Level1Techs video, leading people to think that a 6000 series would work. I myself fell victim to that (and bought a 6700xt thinking FLR works).
          FLR is not supported on any of our GPUs. You need to use secondary bus resets (SBR). SBR is a generic reset mechanism. SBR is what Level1Techs used. You don't need a vendor specific reset.



          • #6
            Originally posted by agd5f View Post

            FLR is not supported on any of our GPUs. You need to use secondary bus resets (SBR). SBR is a generic reset mechanism. SBR is what Level1Techs used. You don't need a vendor specific reset.
            Please don't get my hopes up. I had tried a (slightly modified) version of https://www.reddit.com/r/FPGA/commen...d_pci_express/

            Are you saying you expect this to work? Here is the script I briefly tried (with no luck):

            Code:
            #!/bin/bash
            
            # FROM: https://www.reddit.com/r/FPGA/comments/c68ygd/useful_scripts_for_linuxhosted_pci_express/
            
            dev=$1
            
            if [ -z "$dev" ]; then
                echo "Error: no device specified"
                exit 1
            fi
            
            # Allow a short BDF like 03:00.0 by prepending the default domain
            if [ ! -e "/sys/bus/pci/devices/$dev" ]; then
                dev="0000:$dev"
            fi
            
            if [ ! -e "/sys/bus/pci/devices/$dev" ]; then
                echo "Error: device $dev not found"
                exit 1
            fi
            
            # The parent directory of the device symlink is the bridge/port above it
            port=$(basename "$(dirname "$(readlink "/sys/bus/pci/devices/$dev")")")
            
            if [ ! -e "/sys/bus/pci/devices/$port" ]; then
                echo "Error: device $port not found"
                exit 1
            fi
            
            echo "Removing $dev..."
            
            echo 1 > "/sys/bus/pci/devices/$dev/remove"
            
            echo "Performing hot reset of port $port..."
            
            bc=$(setpci -s "$port" BRIDGE_CONTROL)
            
            echo "Bridge control: $bc"
            
            # Set bit 6 (0x40), the Secondary Bus Reset bit
            bc1=$(printf "%04x" $((0x$bc | 0x40)))
            
            echo "Bridge control 1: $bc1"
            
            setpci -s "$port" BRIDGE_CONTROL=$bc1
            sleep 0.01
            setpci -s "$port" BRIDGE_CONTROL=$bc
            sleep 0.5
            
            echo "Rescanning bus..."
            
            echo 1 > "/sys/bus/pci/devices/$port/rescan"
            If there is a chance this might work, I will look up the PCI spec (to make sure there's no bug in what bridge control value is used) and then spend a day trying it again (along with various BIOS options on the host in case they matter).

            If this works I'm going to have to find a way to buy you a beer or coffee (I think there are web sites for that).

            Thanks for the pointer though. I sort of skimmed over that approach as I have not seen it used in VFIO forums...
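
            For what it's worth, the bit arithmetic in the script can be checked in isolation: in the PCI-to-PCI Bridge spec, bit 6 (0x40) of the Bridge Control register is the Secondary Bus Reset bit, so ORing it into the saved value should flip only that bit. A minimal sketch with a made-up register value (not read from real hardware):

            ```shell
            # Made-up BRIDGE_CONTROL value as setpci would print it (hex, no 0x prefix)
            bc=001f

            # Set bit 6 (0x40), the Secondary Bus Reset bit
            bc1=$(printf "%04x" $((0x$bc | 0x40)))

            echo "$bc1"   # -> 005f (only bit 6 changed)
            ```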



            • #7
              I don't recall the PCI registers off hand, but you basically reset the link on the bridge above the device. Note that the GPU is not a single device from a PCI perspective, it's actually several endpoints (GPU, USB, Audio, etc.). You'll need to do a secondary bus reset on the PCI root port that the GPU is connected to. SBR will reset all of the endpoints behind the bridge where you issue the reset.
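
              The sysfs lookup this describes can be sketched offline (the addresses below are made up, not from the thread): each entry under /sys/bus/pci/devices is a symlink into the PCI device hierarchy, so the parent directory of the endpoint's resolved path is the bridge/root port where the secondary bus reset should be issued:

              ```shell
              # Hypothetical symlink target for a GPU endpoint at 0000:03:00.0;
              # on a real system it comes from: readlink /sys/bus/pci/devices/0000:03:00.0
              link="../../../devices/pci0000:00/0000:00:01.0/0000:03:00.0"

              # The parent directory in the hierarchy is the port above the device --
              # resetting it also resets every sibling endpoint (GPU, audio, USB, ...)
              port=$(basename "$(dirname "$link")")

              echo "$port"   # -> 0000:00:01.0
              ```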
