
Questions about vfio-pci

I think I may have found a problem with VFIO-PCI.

Host environment

  • OS/kernel version:

    Linux 5.10 (the problem also reproduces on the latest Linux mainline)

  • Architecture:

    ARM

  • QEMU flavor:

    qemu-system-aarch64

  • QEMU version:

    QEMU v6.2.0 (also reproduces on QEMU v8.2.0)

  • QEMU command line:

    taskset -c 2-3 qemu-system-aarch64 \
    -machine virt,kernel_irqchip=on,gic-version=3 \
    -kernel ./Image \
    -initrd ./minifs.cpio.gz \
    -bios ./QEMU_EFI.fd \
    -cpu host \
    -enable-kvm \
    -net none -nographic \
    -m 2G,maxmem=100G,slots=3 \
    -smp 2 \
    -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x90000000 pcie_ports=native pciehp.pciehp_debug=1' \
    -device ioh3420,id=root_port1,chassis=1 \
    -device vfio-pci,host=37:01.0,id=net0,bus=root_port1

Description of problem

When I use VFIO-PCI to pass through an hns3 device and load the hns3 driver in the VM, bringing up the hns3 network port intermittently fails.

Steps to reproduce

  1. Start the VM and load the hns3 driver.

  2. Enable the network port:

    ifconfig eth0 10.10.10.10/24 up

  3. Ping the host:

    ping 10.10.10.11 -c 3

Additional information

I have the following findings:

  1. The problem can be reproduced in different kernel versions and QEMU versions.
  2. The problem does not occur when the VM has only one vCPU.
  3. The problem is independent of the GIC version.

The relevant hns3 logic:

image.png

If the VM has two vCPUs, the command "ifconfig eth0 10.10.10.10/24 up" performs two sequential enable_irq operations (vector_num=2). Each enable_irq traps into KVM for interrupt configuration and then exits to QEMU for PCI device emulation. When QEMU emulates the interrupt enable, vfio_[intx/msi/msix]_enable calls vfio_disable_interrupts, which disables all interrupts on the vdev.

image.png

vfio_disable_interrupts in QEMU calls into the kernel vfio driver through vfio_pci_set_irqs_ioctl.
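For readers unfamiliar with that interface, here is a minimal sketch of the kind of request that reaches the kernel. It assumes the VFIO_DEVICE_SET_IRQS semantics documented in <linux/vfio.h>, where VFIO_IRQ_SET_DATA_NONE combined with a count of 0 tears down every vector of the given index. The helper name init_disable_all_request is hypothetical and is not a QEMU function.

```c
#include <linux/vfio.h>
#include <string.h>

/* Hypothetical helper (not QEMU code): fills in a VFIO_DEVICE_SET_IRQS
 * request that disables every vector of one IRQ index, for example
 * VFIO_PCI_MSIX_IRQ_INDEX. */
void init_disable_all_request(struct vfio_irq_set *irq_set, unsigned int index)
{
    memset(irq_set, 0, sizeof(*irq_set));
    irq_set->argsz = sizeof(*irq_set);
    irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
    irq_set->index = index;
    irq_set->start = 0;
    irq_set->count = 0;   /* DATA_NONE with count 0: disable the whole index */
}
```

A caller would then pass the filled structure to ioctl(device_fd, VFIO_DEVICE_SET_IRQS, &irq_set) on an open VFIO device file descriptor. Because the request covers the whole index, vectors that were already working are torn down as well.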

image.png

The stack dump is shown above. From there, its_irq_domain_deactivate calls its_send_discard to discard the interrupt on the device.

If an interrupt is handled after the first enable_irq and the second enable_irq then discards it, the resulting inconsistency causes the network port to fail to come up.
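The sequence I suspect can be replayed with a small toy model. This is only a sketch: the toy_* names and structures are made up for illustration and are not QEMU or kernel code, the "pending" flag is a stand-in for whatever interrupt state is lost by the discard, and re-enabling vectors 0..v after the disable is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_VECTORS 2   /* matches vector_num=2 in the report */

/* Toy per-vector state; not a QEMU or kernel structure. */
struct toy_vector {
    bool enabled;
    bool pending;   /* an interrupt was raised and not yet consumed */
};

struct toy_device {
    struct toy_vector vec[NUM_VECTORS];
    unsigned int discarded;   /* pending interrupts lost to a disable */
};

/* Models "disable all interrupts on the device": every vector is torn
 * down and anything still pending is dropped. */
void toy_disable_all(struct toy_device *dev)
{
    for (size_t i = 0; i < NUM_VECTORS; i++) {
        if (dev->vec[i].pending) {
            dev->discarded++;
        }
        dev->vec[i].enabled = false;
        dev->vec[i].pending = false;
    }
}

/* Models the reported enable path: disable everything first, then
 * enable vectors 0..v (assumption of this toy model). */
void toy_enable_vector(struct toy_device *dev, size_t v)
{
    assert(v < NUM_VECTORS);
    toy_disable_all(dev);
    for (size_t i = 0; i <= v; i++) {
        dev->vec[i].enabled = true;
    }
}

/* Models the device raising an interrupt; it only latches when the
 * vector is currently enabled. */
void toy_raise(struct toy_device *dev, size_t v)
{
    assert(v < NUM_VECTORS);
    if (dev->vec[v].enabled) {
        dev->vec[v].pending = true;
    }
}

/* Replays the sequence from the report and returns how many pending
 * interrupts were lost along the way. */
unsigned int toy_replay(void)
{
    struct toy_device dev = {0};

    toy_enable_vector(&dev, 0);   /* first enable_irq */
    toy_raise(&dev, 0);           /* interrupt arrives on vector 0 */
    toy_enable_vector(&dev, 1);   /* second enable_irq disables all first */
    return dev.discarded;
}
```

In this model toy_replay() reports one lost interrupt, caused purely by the disable-all step inside the second enable. With a single enable call nothing is discarded, which would be consistent with the problem not showing up with one vCPU, although the model itself proves nothing about the real code paths.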

This puzzles me: why does vfio-pci disable all of the device's interrupts before enabling IRQs?

Edited by Jinqian Yang