Linux 6.5 Should Spend Less Time Waiting On PCIe Devices
The PCI subsystem updates have been submitted for the ongoing Linux 6.5 development.
With the PCI/PCIe changes for Linux 6.5 there isn't any particular set of exciting features, but there are a number of low-level improvements to the kernel in this area. In particular, there have been a few patches this cycle for reducing the amount of time spent waiting on PCI(e) devices.
Thanks to Intel's Mika Westerberg, the pci_bridge_wait_for_secondary_bus() kernel function will spend less time waiting for slow PCIe links to be established. Mika explained in the patch:
"With slow links (<= 5GT/s) active link reporting is not mandatory, so if a device is disconnected during system sleep we might end up waiting for it to respond for ~60s slowing down resume time. PCIe spec r6.0 sec 6.6.1 mandates that the system software must wait for at least 1s before it can determine the device as broken device so use the minimum requirement for slow links and bail out if we do not get reply within 1s. However, if the port supports active link reporting we can continue the wait following what we do with the fast links.
This should make system resume time faster for slow links as well while still following the PCIe spec."
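The decision described in that patch message can be sketched roughly as follows. This is a simplified model of the policy as quoted, not the kernel code: the function name, constants, and parameters are all illustrative assumptions.

```python
# Simplified model of the wait policy described in the quoted patch text.
# All names and constants are illustrative; none are kernel identifiers.

SLOW_LINK_MAX_GTS = 5.0   # "slow links (<= 5GT/s)" per the quote
MIN_WAIT_S = 1.0          # minimum wait before treating the device as broken
LONG_WAIT_S = 60.0        # the "~60s" wait mentioned in the quote


def max_wait_seconds(link_speed_gts: float, active_link_reporting: bool) -> float:
    """Return how long to wait for a device behind a bridge to respond."""
    slow_link = link_speed_gts <= SLOW_LINK_MAX_GTS
    if slow_link and not active_link_reporting:
        # No way to tell whether the link came up, so give up after the
        # minimum wait instead of stalling resume for about a minute.
        return MIN_WAIT_S
    # With active link reporting (or a fast link), keep the longer wait.
    return LONG_WAIT_S
```

Under this model, a disconnected device behind a 2.5GT/s port without active link reporting costs about one second on resume rather than about sixty.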
Another change was prompted by PCIe link training failures that have commonly been seen with the ASMedia ASM2824 controller. Maciej Rozycki described the problem in the patch:
"A PCIe link training phenomenon where a pair of devices both capable of operating at a link speed above 2.5GT/s seems unable to negotiate the link speed and continues training indefinitely with the Link Training bit switching on and off repeatedly and the data link layer never reaching the active state."
As a result, link training will first happen at 2.5GT/s and then a higher rate will be attempted, avoiding the link training failures plaguing some controllers.
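As a rough illustration of that two-step approach, here is a minimal sketch. The `train` callback stands in for whatever actually retrains the link, and the fallback to 2.5GT/s when the higher rate fails is an assumption of this sketch rather than something stated in the pull request.

```python
from typing import Callable, Optional

BASE_SPEED_GTS = 2.5  # the lowest PCIe link rate


def bring_up_link(train: Callable[[float], bool],
                  max_speed_gts: float) -> Optional[float]:
    """Train at 2.5GT/s first, then try the higher rate.

    `train(speed)` is a hypothetical callback that returns True if the
    link reaches the active state at that speed. Returns the speed the
    link ends up at, or None if it never came up.
    """
    if not train(BASE_SPEED_GTS):
        return None
    if max_speed_gts <= BASE_SPEED_GTS:
        return BASE_SPEED_GTS
    if train(max_speed_gts):
        return max_speed_gts
    # Assumption: if the higher rate fails, go back to the rate that worked.
    return BASE_SPEED_GTS if train(BASE_SPEED_GTS) else None
```

The point of the ordering is that the link is first established at a rate known to work, so a failed attempt at the higher rate no longer leaves it training indefinitely.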
Separately, but still part of this PCI pull request, Linux 6.5 introduces a new delay after a PCI function-level reset (FLR) of the Solidigm P44 Pro NVMe SSD. A KVM hang can currently happen when a Solidigm P44 Pro NVMe SSD is passed through to a guest virtual machine via the IOMMU and the KVM guest is then rebooted. Adding a 250ms delay after a PCI FLR for this particular NVMe solid-state drive resolves the issue. Interestingly, the same issue occurred previously with select Intel DC SSDs. Intel spun off its NAND and SSD business, which ultimately became Solidigm, and it seems the issue continues to affect some new devices.
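Device-specific workarounds like this are typically keyed on the PCI vendor and device ID. The sketch below models the idea only: the IDs are placeholders (not the real Solidigm IDs), and the function and table names are hypothetical rather than taken from the kernel.

```python
import time
from typing import Callable

# Placeholder IDs for illustration only; not the real device's IDs.
QUIRKY_VENDOR_ID = 0xFFFF
QUIRKY_DEVICE_ID = 0x0001

# Extra settle time, in seconds, to wait after an FLR for listed devices.
POST_FLR_DELAY_S = {
    (QUIRKY_VENDOR_ID, QUIRKY_DEVICE_ID): 0.250,  # 250ms, per the article
}


def reset_function(vendor_id: int, device_id: int,
                   do_flr: Callable[[], None],
                   sleep: Callable[[float], None] = time.sleep) -> float:
    """Issue an FLR via `do_flr`, then apply any device-specific delay.

    Returns the extra delay applied, in seconds (0.0 for unlisted devices).
    """
    do_flr()
    extra = POST_FLR_DELAY_S.get((vendor_id, device_id), 0.0)
    if extra > 0:
        sleep(extra)
    return extra
```

Only devices listed in the table pay the extra 250ms, so resets of every other device are unaffected.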
More details on other PCI changes for Linux 6.5 can be found via the pull request.