Intel "coIOMMU" Can Help With Performance For VMs When Using Direct I/O Access
Currently, when directly assigning I/O devices to virtual machines, the guest memory needs to be statically pinned unless a vIOMMU is used, which avoids the static pinning but carries performance implications of its own. Intel engineers have been working on a virtual IOMMU implementation with DMA buffer tracking to overcome these limitations.
With Intel's proposed "coIOMMU" implementation there is fine-grained pinning and vendor-agnostic support for emulated or para-virtualized vIOMMUs. Yu Zhang of Intel presented this coIOMMU effort at KVM Forum 2020.
Intel engineers find that the current static pinning approach used with direct I/O can significantly lengthen virtual machine creation time (up to 73x longer when allocating ~128GB of system memory) and also prevents many memory optimizations, while using a virtual IOMMU instead incurs significant performance costs of its own.
With coIOMMU, DMA tracking is decoupled from DMA remapping in the vIOMMU, a design intended to be low-cost, non-intrusive, widely applicable, and extensible. Intel's proof-of-concept coIOMMU implementation is an extension of Intel VT-d and can be applied to both emulated and para-virtualized IOMMUs. Intel engineers are ultimately looking to upstream it based on VirtIO IOMMU.
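To give a rough sense of what "DMA buffer tracking" with fine-grained pinning means in practice, below is a minimal, purely illustrative C sketch: it assumes a hypothetical shared per-page tracking bitmap that the guest updates around DMA map/unmap and that the hypervisor consults to pin only the flagged pages, rather than pinning all guest memory up front. None of the names here reflect the actual coIOMMU interface.

```c
/*
 * Illustrative sketch only -- not the real coIOMMU interface.
 * Models the concept of a shared DMA-tracking table: the guest flags
 * pages used as DMA buffers, and the hypervisor pins just those pages
 * on demand instead of statically pinning all guest memory.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define GUEST_PAGES 64              /* toy guest with 64 pages */

/* One bit per guest page: set = page is (or may be) a DMA target. */
static uint64_t dma_track_bitmap;   /* notionally shared guest <-> VMM */
static bool     host_pinned[GUEST_PAGES];

/* Guest side: flag a page before handing its address to the device. */
static void guest_dma_map(unsigned int gfn)
{
    dma_track_bitmap |= (1ULL << gfn);
}

/* Guest side: clear the flag once the DMA buffer is unmapped. */
static void guest_dma_unmap(unsigned int gfn)
{
    dma_track_bitmap &= ~(1ULL << gfn);
}

/* Host/VMM side: pin only the pages flagged in the tracking table. */
static void host_pin_tracked_pages(void)
{
    for (unsigned int gfn = 0; gfn < GUEST_PAGES; gfn++) {
        bool tracked = dma_track_bitmap & (1ULL << gfn);
        if (tracked && !host_pinned[gfn]) {
            host_pinned[gfn] = true;    /* stand-in for actually pinning */
            printf("pinned gfn %u\n", gfn);
        }
    }
}

int main(void)
{
    guest_dma_map(3);               /* guest sets up two DMA buffers */
    guest_dma_map(17);
    host_pin_tracked_pages();       /* host pins just those two pages */
    guest_dma_unmap(3);             /* later, the buffer at gfn 3 is released */
    return 0;
}
```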
This PDF slide deck goes over all of the details for those interested. The main takeaway is that their performance tests show coIOMMU performing nearly the same as direct I/O without a vIOMMU.