PCI Multifunction card hotplug/hotunplug support (QEMU driver)
Goal
Several attempts to implement PCI Multifunction hotplug/hotunplug in Libvirt have been made over the last few years. One attempt was made in 2018 by Shivaprasad G Bhat [1], followed by a handful of rebases and revisions of that original work made by me (the last one is [2]). This bug is an attempt to document what we've done, so that anyone else interested in pushing this forward has something to build upon.
Technical details
First, a quick summary of how this feature already works in QEMU. To hotplug a PCI Multifunction device in QEMU, all non-zero functions must be hotplugged first. QEMU queues these hotplug events, and only after function zero is hotplugged are the events sent to the virtual machine. There is no requirement to hotplug every function of the device, so partial hotplug is supported. This is the behavior for both x86 and ppc64 archs.
Hotunplugging function zero removes the whole slot in both x86 and ppc64 QEMU guests. QEMU version 4.1.0 and older requires all non-zero functions to be hotunplugged first for ppc64 guests.
I have no insight into how s390x behaves regarding PCI Multifunction hotplug/hotunplug.
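The ordering rule above can be illustrated with a small sketch. This is not code from QEMU or Libvirt; the function name hotplug_order and the tuple layout are hypothetical, chosen only to show "non-zero functions first, function zero last" for the functions of a single slot.

```python
from typing import List, Tuple

# Hypothetical representation of a PCI address: (domain, bus, slot, function).
PciAddress = Tuple[int, int, int, int]


def hotplug_order(functions: List[PciAddress]) -> List[PciAddress]:
    """Order the functions of one slot: non-zero functions first, function zero last."""
    if not functions:
        return []
    # All functions of a multifunction device share domain, bus and slot.
    if len({addr[:3] for addr in functions}) != 1:
        raise ValueError("all functions must share the same domain:bus:slot")
    if len({addr[3] for addr in functions}) != len(functions):
        raise ValueError("duplicate function number")
    non_zero = sorted(addr for addr in functions if addr[3] != 0)
    zero = [addr for addr in functions if addr[3] == 0]
    # Function zero goes last: per the description above, the queued hotplug
    # events reach the guest only once function zero is plugged.
    return non_zero + zero
```

Partial hotplug is allowed by this sketch: any subset of functions is accepted, and function zero is simply placed last when it is present.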
- Libvirt implementation
To allow for partial hotplug, as QEMU does, we can't assume that all the available functions are going to be added to the guest. This led us to a design that defines a new XML wrapper element (the top-level element of the example below) containing the list of hostdevs to be hotplugged in a single operation. E.g.:
<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <driver name='vfio'/>
    <source>
      <address domain='0x0005' bus='0x90' slot='0x01' function='0x1'/>
    </source>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <driver name='vfio'/>
    <source>
      <address domain='0x0005' bus='0x90' slot='0x01' function='0x2'/>
    </source>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <driver name='vfio'/>
    <source>
      <address domain='0x0005' bus='0x90' slot='0x01' function='0x3'/>
    </source>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <driver name='vfio'/>
    <source>
      <address domain='0x0005' bus='0x90' slot='0x01' function='0x0'/>
    </source>
  </hostdev>
</devices>
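For anyone who wants to generate this XML instead of writing it by hand, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The helper name build_multifunction_xml is hypothetical. It mirrors the example above, including listing function zero last, which is an assumption about what the patches expect rather than a documented requirement.

```python
import xml.etree.ElementTree as ET


def build_multifunction_xml(domain, bus, slot, functions):
    """Build the hostdev list XML for the given host functions of one slot."""
    # Non-zero functions first, function zero last (as in the example above).
    ordered = sorted(f for f in functions if f != 0)
    if 0 in functions:
        ordered.append(0)

    devices = ET.Element("devices")
    for function in ordered:
        hostdev = ET.SubElement(
            devices, "hostdev",
            {"mode": "subsystem", "type": "pci", "managed": "yes"},
        )
        ET.SubElement(hostdev, "driver", {"name": "vfio"})
        source = ET.SubElement(hostdev, "source")
        ET.SubElement(source, "address", {
            "domain": f"0x{domain:04x}",
            "bus": f"0x{bus:02x}",
            "slot": f"0x{slot:02x}",
            "function": f"0x{function:x}",
        })

    # ET.indent only exists in Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(devices)
    return ET.tostring(devices, encoding="unicode")


if __name__ == "__main__":
    print(build_multifunction_xml(0x0005, 0x90, 0x01, [0, 1, 2, 3]))
```

The printed output can be saved to a file and passed to virsh in the same way as a hand-written XML.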
The bulk of the logic changes dealt with auto-assigning the virtual functions so that they end up in the same slot after hotplug (a common constraint of PCI Multifunction devices, at least on ppc64), plus the logic needed to handle the hotplug of this XML. The code treats the XML hotplug as a sequence of individual hostdev hotplugs; hotplugging function zero triggers the multiple 'device_add' QAPI calls to QEMU that complete the process. The same 'virsh attach-device' command is used with this XML format.
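The same-slot constraint can be shown with a short sketch. This is not the logic from the patches, which is not reproduced here. It assumes that "auto-assign" means picking one free guest slot and giving every function the same slot with its own function number. The name assign_guest_addresses and the default of skipping slot 0 are hypothetical choices for illustration.

```python
from typing import Dict, Iterable, Set, Tuple


def assign_guest_addresses(
    functions: Iterable[int],
    used_slots: Set[int],
    first_slot: int = 1,   # assumption: slot 0 is left alone
    last_slot: int = 31,   # a PCI bus has 32 slots (0-31)
) -> Dict[int, Tuple[int, int]]:
    """Map each function number to a (guest_slot, guest_function) pair in one slot."""
    funcs = sorted(set(functions))
    # A PCI slot holds at most 8 functions, numbered 0 to 7.
    if any(f < 0 or f > 7 for f in funcs):
        raise ValueError("PCI function numbers must be in the range 0-7")
    for slot in range(first_slot, last_slot + 1):
        if slot not in used_slots:
            # Every function lands in the same free slot and keeps its number.
            return {f: (slot, f) for f in funcs}
    raise RuntimeError("no free slot available on this bus")
```

For example, with guest slots 1 and 2 already in use, all four functions of the device above would be placed in slot 3.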
Given a running virtual machine called 'vm' and the XML above representing a PCI multifunction device saved in a 'hotplug.xml' file, this command would hotplug this device to the vm:
$ virsh attach-device vm hotplug.xml --live --config
Hotunplugging is done in the same manner, with a slightly easier driver implementation because removing function zero will remove all the functions. For the same scenario mentioned above, this command would hotunplug the PCI multifunction device:
$ virsh detach-device vm hotplug.xml --live --config
Additional information
More information can be found in the mailing list discussions listed below. I also have a gitlab repo with these patches rebased up to Libvirt 6.3.0; feel free to use it.