Linux 6.13 KVM Eliminates An "Awful Idea", Many x86_64 Improvements
The KVM changes for Linux 6.13 were merged yesterday, further enhancing the open-source virtualization stack.
Kernel-based Virtual Machine maintainer Paolo Bonzini of Red Hat explained the biggest change to the KVM code in Linux 6.13:
"The biggest change here is eliminating the awful idea that KVM had, of essentially guessing which pfns are refcounted pages. The reason to do so was that KVM needs to map both non-refcounted pages (for example BARs of VFIO devices) and VM_PFNMAP/VM_MIXEDMAP VMAs that contain refcounted pages. However, the result was security issues in the past, and more recently the inability to map VM_IO and VM_PFNMAP memory that _is_ backed by struct page but is not refcounted. In particular this broke virtio-gpu blob resources (which directly map host graphics buffers into the guest as "vram" for the virtio-gpu device) with the amdgpu driver, because amdgpu allocates non-compound higher order pages and the tail pages could not be mapped into KVM.
This requires adjusting all uses of struct page in the per-architecture code, to always work on the pfn whenever possible. The large series that did this, from David Stevens and Sean Christopherson, also cleaned up substantially the set of functions that provided arch code with the pfn for a host virtual address. The previous maze of twisty little passages, all different, is replaced by five functions (__gfn_to_page, __kvm_faultin_pfn, the non-__ versions of these two, and kvm_prefetch_pages) saving almost 200 lines of code."
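For readers who want a feel for why guessing from the pfn goes wrong, here is a deliberately simplified toy model in Python. It is not kernel code: the `Page` and `Pfn` classes and both `map_*` functions are hypothetical names invented for this illustration, and it assumes the fix amounts to the lookup path reporting whether it took a page reference rather than KVM inferring that from the pfn.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Toy stand-in for struct page; only the reference count is modeled."""
    refcount: int


@dataclass
class Pfn:
    """Toy page frame; page is None when no struct page backs the frame."""
    number: int
    page: Optional[Page] = None


def map_by_guessing(pfn: Pfn) -> bool:
    """Heuristic model: any pfn that has a struct page is assumed refcounted."""
    if pfn.page is None:
        return True  # e.g. a device BAR: mapped without taking a reference
    if pfn.page.refcount == 0:
        return False  # a reference cannot be taken on a zero-refcount page
    pfn.page.refcount += 1
    return True


def map_by_origin(pfn: Pfn, took_reference: bool) -> bool:
    """Explicit model: the lookup path says whether a reference is held."""
    if took_reference:
        if pfn.page is None:
            raise ValueError("a reference requires a backing page")
        pfn.page.refcount += 1
    return True  # the mapping itself only needs the pfn
```

In this model a tail page of a non-compound higher-order allocation is a `Pfn` whose `Page` has a refcount of zero: `map_by_guessing` refuses it, while `map_by_origin(..., took_reference=False)` maps it and leaves the count untouched.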
KVM on ARM with Linux 6.13 meanwhile adds PSCI v1.3 SYSTEM_OFF2 support for requesting hibernation (similar to the ACPI S4 state), PMU support under nested virtualization, and support for stage-1 permission indirection and permission overlays.
On RISC-V, KVM can now be accelerated when KVM itself is running as a guest.
KVM on the PowerPC side has finished removing obsolete references to PowerPC 970 support, which was removed from the kernel back in 2014.
The KVM x86 (x86_64) changes continue to be quite heavy. There is work on reducing vCPU jitter, batching TLB flushes when dirty page logging is toggled off so that disabling dirty logging is roughly 3x faster, dropping the shrinker that was doing a poor job of reclaiming shadow page tables in low-memory scenarios, advertising new CPU instructions found in the upcoming Intel Clearwater Forest server processors, advertising the AMD_IBPB_RET bit to user-space, and various fixes.
More details on all of these KVM changes, now merged for the Linux 6.13 kernel, are available via this pull request.
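As a small illustration of what the AMD_IBPB_RET bit means for user-space, the sketch below decodes it from an EBX register value. It relies on the bit being enumerated as CPUID leaf 0x80000008, EBX bit 30. The helper name is hypothetical, and the code assumes the caller has already obtained the EBX value, for example from the CPUID data a VMM receives from KVM; it does not query the CPU or KVM itself.

```python
# CPUID leaf and bit position for the AMD "IBPB clears return address
# predictor" feature (IBPB_RET).
CPUID_LEAF_AMD_EXT = 0x80000008
AMD_IBPB_RET_BIT = 30


def has_amd_ibpb_ret(ebx: int) -> bool:
    """Return True if EBX of leaf 0x80000008 has the IBPB_RET bit set.

    Hypothetical helper: ebx must be supplied by the caller.
    """
    return bool((ebx >> AMD_IBPB_RET_BIT) & 1)
```

A VMM could use a check along these lines to decide whether to expose the bit in the CPUID model it presents to a guest.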