Assertion failure with active vhost NIC when snapshot_save_job_bh() is executed as part of a vCPU thread's aio_poll()
Host environment
-
Operating system: Proxmox VE 8.1 (but using upstream QEMU)
-
OS/kernel version: Linux pve8a1 6.5.11-7-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-7 (2023-12-05T09:44Z) x86_64 GNU/Linux
-
Architecture: x86
-
QEMU flavor: qemu-system-x86_64
-
QEMU version: 8.2.0 but originally found with an older version
-
QEMU command line:
./qemu-system-x86_64 \ -accel kvm \ -chardev 'socket,id=qmp,path=/var/run/qemu-server/112.qmp,server=on,wait=off' \ -mon 'chardev=qmp,mode=control' \ -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \ -mon 'chardev=qmp-event,mode=control' \ -pidfile /var/run/qemu-server/112.pid \ -drive 'if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE_4M.fd' \ -drive 'if=pflash,unit=1,id=drive-efidisk0,format=raw,file=/dev/lvmthin/vm-112-disk-0,size=540672' \ -nodefaults \ -smp 4 \ -vnc 'unix:/var/run/qemu-server/112.vnc,password=on' \ -device 'pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5' \ -device 'VGA,id=vga,bus=pci.0,addr=0x2' \ -device 'virtio-scsi-pci,id=virtioscsi1,bus=pci.3,addr=0x2' \ -drive 'file=/mnt/pve/nfs/images/112/vm-112-disk-1.qcow2,if=none,id=drive-scsi1,format=qcow2,node-name=scsi1' \ -device 'scsi-hd,bus=virtioscsi1.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi1,id=scsi1' \ -netdev 'type=tap,id=net0,ifname=tap112i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \ -device 'virtio-net-pci,mac=F2:ED:89:F4:D6:56,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256' \ -machine 'type=pc'
Description of problem
During a snapshot-save
QMP command the snapshot_save_job_bh()
bottom half can end up being executed as part of a vCPU thread's aio_poll()
. This is problematic and can lead to an assertion failure (see below for backtrace) when there is an active vhost network device:
qemu-system-x86_64: ../hw/net/virtio-net.c:3835: virtio_net_pre_save: Assertion `!n->vhost_started' failed.
Steps to reproduce
It is very racy and very difficult to reproduce when actually taking snapshots. So the way I can get it pretty reliably is:
- Issue
snapshot-save
QMP commands with an invalid device ID in a loop. At the same time, have the guest write to the pflash. - In GDB, wait for
snapshot_save_job_bh()
to be hit by a vCPU thread. - Manually change the device ID to a valid one (
scsi1
in the example) so that taking a snapshot will actually be attempted. - Continue in GDB and the assertion failure will happen.
Additional information
Full backtrace:
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007f1de5ae3d9f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 0x00007f1de5a94f32 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007f1de5a7f472 in __GI_abort () at ./stdlib/abort.c:79
#4 0x00007f1de5a7f395 in __assert_fail_base (fmt=0x7f1de5bf3a90 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x563cb92d56e7 "!n->vhost_started",
file=file@entry=0x563cb92d56d0 "../hw/net/virtio-net.c", line=line@entry=3835, function=function@entry=0x563cb92d65a0 <__PRETTY_FUNCTION__.2> "virtio_net_pre_save") at ./assert/assert.c:92
#5 0x00007f1de5a8de32 in __GI___assert_fail (assertion=assertion@entry=0x563cb92d56e7 "!n->vhost_started", file=file@entry=0x563cb92d56d0 "../hw/net/virtio-net.c", line=line@entry=3835,
function=function@entry=0x563cb92d65a0 <__PRETTY_FUNCTION__.2> "virtio_net_pre_save") at ./assert/assert.c:101
#6 0x0000563cb8ebf23c in virtio_net_pre_save (opaque=<optimized out>) at ../hw/net/virtio-net.c:3835
#7 virtio_net_pre_save (opaque=<optimized out>) at ../hw/net/virtio-net.c:3829
#8 0x0000563cb917515b in vmstate_save_state_v (f=0x7f1dc43aec30, vmsd=0x563cb9e5a580 <vmstate_virtio_net>, opaque=0x563cbbb6eb40, vmdesc=0x7f1dc4080040, version_id=11, errp=0x7f1dcbdf9908)
at ../migration/vmstate.c:359
#9 0x0000563cb9175d0c in vmstate_save_state_with_err (f=<optimized out>, vmsd=<optimized out>, opaque=<optimized out>, vmdesc_id=<optimized out>, errp=<optimized out>) at ../migration/vmstate.c:347
#10 0x0000563cb8d9a1b2 in vmstate_save (f=f@entry=0x7f1dc43aec30, se=se@entry=0x563cbbcbdc70, vmdesc=vmdesc@entry=0x7f1dc4080040) at ../migration/savevm.c:1037
#11 0x0000563cb8d9d6e6 in qemu_savevm_state_complete_precopy_non_iterable (f=f@entry=0x7f1dc43aec30, in_postcopy=in_postcopy@entry=false, inactivate_disks=inactivate_disks@entry=false)
at ../migration/savevm.c:1553
#12 0x0000563cb8d9daa2 in qemu_savevm_state_complete_precopy (f=f@entry=0x7f1dc43aec30, iterable_only=iterable_only@entry=false, inactivate_disks=inactivate_disks@entry=false) at ../migration/savevm.c:1628
#13 0x0000563cb8da076e in qemu_savevm_state (errp=0x7f1dc42c59f0, f=0x7f1dc43aec30) at ../migration/savevm.c:1734
#14 save_snapshot (name=<optimized out>, overwrite=overwrite@entry=false, vmstate=<optimized out>, has_devices=has_devices@entry=true, devices=0x7f1dc4096600, errp=0x7f1dc42c59f0) at ../migration/savevm.c:3131
#15 0x0000563cb8da0926 in snapshot_save_job_bh (opaque=0x7f1dc42c5930) at ../migration/savevm.c:3430
#16 0x0000563cb9110036 in aio_bh_poll (ctx=ctx@entry=0x563cba818b40) at ../util/async.c:216
#17 0x0000563cb90fa09a in aio_poll (ctx=ctx@entry=0x563cba818b40, blocking=blocking@entry=true) at ../util/aio-posix.c:722
#18 0x0000563cb8fb1015 in bdrv_poll_co (s=0x7f1dcbdf9db0) at /home/febner/repos/qemu/block/block-gen.h:43
#19 blk_pwrite (blk=<optimized out>, offset=offset@entry=91136, bytes=bytes@entry=512, buf=0x7f1dc9a16400, flags=flags@entry=0) at block/block-gen.c:2012
#20 0x0000563cb8bb8985 in pflash_update (pfl=pfl@entry=0x563cbaa84bf0, offset=91136, offset@entry=91526, size=size@entry=1) at ../hw/block/pflash_cfi01.c:394
#21 0x0000563cb8bbacd8 in pflash_write (be=0, width=1, value=63, offset=91526, pfl=0x563cbaa84bf0) at ../hw/block/pflash_cfi01.c:522
#22 pflash_mem_write_with_attrs (opaque=0x563cbaa84bf0, addr=91526, value=<optimized out>, len=1, attrs=...) at ../hw/block/pflash_cfi01.c:681
#23 0x0000563cb8f06e2e in access_with_adjusted_size (addr=addr@entry=91526, value=value@entry=0x7f1dcbdf9f58, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>,
access_fn=0x563cb8f06710 <memory_region_write_with_attrs_accessor>, mr=<optimized out>, attrs=...) at ../system/memory.c:573
#24 0x0000563cb8f07e59 in memory_region_dispatch_write (mr=mr@entry=0x563cbaa84fb0, addr=addr@entry=91526, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...) at ../system/memory.c:1528
#25 0x0000563cb8f0f43c in flatview_write_continue (fv=fv@entry=0x7f1dc42e4720, addr=addr@entry=4290864518, attrs=..., attrs@entry=..., ptr=ptr@entry=0x7f1de7946028, len=len@entry=1, addr1=<optimized out>,
l=<optimized out>, mr=0x563cbaa84fb0) at ../system/physmem.c:2714
#26 0x0000563cb8f0f6b3 in flatview_write (fv=0x7f1dc42e4720, addr=addr@entry=4290864518, attrs=attrs@entry=..., buf=buf@entry=0x7f1de7946028, len=len@entry=1) at ../system/physmem.c:2756
#27 0x0000563cb8f12959 in address_space_write (len=1, buf=0x7f1de7946028, attrs=..., addr=4290864518, as=0x563cb9fd8ec0 <address_space_memory>) at ../system/physmem.c:2863
#28 address_space_rw (as=0x563cb9fd8ec0 <address_space_memory>, addr=4290864518, attrs=attrs@entry=..., buf=buf@entry=0x7f1de7946028, len=1, is_write=<optimized out>) at ../system/physmem.c:2873
#29 0x0000563cb8f64ab8 in kvm_cpu_exec (cpu=cpu@entry=0x563cbac066d0) at ../accel/kvm/kvm-all.c:2915
#30 0x0000563cb8f65ce5 in kvm_vcpu_thread_fn (arg=arg@entry=0x563cbac066d0) at ../accel/kvm/kvm-accel-ops.c:51
#31 0x0000563cb90fd1c8 in qemu_thread_start (args=0x563cbaac33c0) at ../util/qemu-thread-posix.c:541
#32 0x00007f1de5ae2044 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#33 0x00007f1de5b6261c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81