GPU pass-through API support
This document contains the software design for GPU pass-through API support. GPU pass-through is already possible via the VM.other_config:pci key, which is usable for PCI pass-through in general. This design proposal introduces a new GPU model to the XenAPI through which VMs can be assigned GPUs in a more flexible and convenient way.
Rather than modelling GPU pass-through from a PCI perspective and having the user manipulate PCI devices directly, we take a higher-level view by introducing a dedicated graphics model. The graphics model is similar to the networking and storage models, in which virtual and physical devices are linked through an intermediate abstraction layer (e.g. the "Network" class in the networking model).
The basic graphics model is as follows:
A host owns a number of physical GPU devices (pGPUs), each of which is available for passing through to a VM.
A VM may have a virtual GPU device (vGPU), which means it expects to have access to a GPU when it is running.
Identical pGPUs are grouped across a resource pool in GPU groups. GPU groups are automatically created and maintained by XenServer (XS).
A GPU group connects vGPUs to pGPUs in the same way as VIFs are connected to PIFs by Network objects: for a VM v having a vGPU on GPU group p to run on host h, host h must have a pGPU in GPU group p and pass it through to VM v.
VM start and non-live migration rules are analogous to the network API and follow the above rules.
If a VM that has a vGPU is started while no pGPU is available, an exception is raised and the VM does not start. Consequently, to guarantee that a VM always has access to a pGPU, the number of vGPUs should not exceed the number of pGPUs in a GPU group.
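The rules above can be illustrated with a small, self-contained Python sketch. It is not the XenAPI implementation: the class names (PGPU, VGPU), the exception NoPGPUAvailable, and the helpers start_vm and group_is_overcommitted are hypothetical names chosen for illustration, and GPU groups are represented by plain string identifiers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PGPU:
    """A physical GPU owned by a host and belonging to one GPU group."""
    host: str
    group: str
    assigned_to: Optional[str] = None  # name of the VM using it, if any


@dataclass
class VGPU:
    """A virtual GPU: the VM expects a pGPU from this group when running."""
    vm: str
    group: str


class NoPGPUAvailable(Exception):
    """Raised when a VM with a vGPU cannot be given a pGPU on the host."""


def start_vm(vgpu: VGPU, host: str, pgpus: List[PGPU]) -> PGPU:
    """Pass through a free pGPU on `host` from the vGPU's group, or fail."""
    for pgpu in pgpus:
        if (pgpu.host == host and pgpu.group == vgpu.group
                and pgpu.assigned_to is None):
            pgpu.assigned_to = vgpu.vm
            return pgpu
    raise NoPGPUAvailable(
        f"no free pGPU in group {vgpu.group!r} on host {host!r}")


def group_is_overcommitted(group: str, vgpus: List[VGPU],
                           pgpus: List[PGPU]) -> bool:
    """True if the group has more vGPUs than pGPUs (no start guarantee)."""
    n_vgpus = sum(1 for v in vgpus if v.group == group)
    n_pgpus = sum(1 for p in pgpus if p.group == group)
    return n_vgpus > n_pgpus
```

In this sketch, starting a second VM against a group whose only pGPU on the host is already assigned raises NoPGPUAvailable, which mirrors the exception described above. The overcommit check corresponds to the advice that the number of vGPUs should not exceed the number of pGPUs in a group.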
Currently, the following restrictions apply:
Hotplug is not supported.
Suspend/resume and checkpointing (memory snapshots) are not supported.
Live migration (XenMotion) is not supported.
No more than one GPU per VM is supported.
Only Windows guests are supported.
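As a rough illustration of how these restrictions might be checked, the following Python sketch validates a VM configuration and an operation request. All names here (VMConfig, UNSUPPORTED_OPERATIONS, config_violations, operation_allowed) and the operation strings are assumptions made for this example and are not part of the XenAPI.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VMConfig:
    """Hypothetical summary of the VM properties relevant to GPU pass-through."""
    name: str
    guest_os: str    # e.g. "windows" or "linux"
    vgpu_count: int  # number of vGPUs configured on the VM


# Operations that the restrictions rule out for a VM that has a vGPU.
UNSUPPORTED_OPERATIONS = {
    "hotplug", "suspend", "resume", "checkpoint", "live_migrate",
}


def config_violations(vm: VMConfig) -> List[str]:
    """Return the configuration restrictions that this VM violates."""
    problems = []
    if vm.vgpu_count > 1:
        problems.append("no more than one GPU per VM is supported")
    if vm.vgpu_count > 0 and vm.guest_os.lower() != "windows":
        problems.append("only Windows guests are supported")
    return problems


def operation_allowed(vm: VMConfig, operation: str) -> bool:
    """VMs without a vGPU are unaffected; others may not use restricted ops."""
    return vm.vgpu_count == 0 or operation not in UNSUPPORTED_OPERATIONS
```

A VM without a vGPU passes both checks unchanged, so the restrictions only affect VMs that actually use GPU pass-through.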