qemu-storage-daemon: fuse export deadlock
Host environment
- Operating system: Arch Linux
- OS/kernel version: 6.7.4-arch1-1
- Architecture: x86_64
- QEMU flavor: qemu-storage-daemon
- QEMU version: 8.2.1
- QEMU command line:
dd if=/dev/random of=src_disk.raw count=4194304 bs=1024
qemu-img create -b src_disk.raw -o backing_fmt=raw -f qcow2 layer1.qcow2
touch /tmp/fuse_exp
qemu-storage-daemon \
--chardev socket,path=qmp.sock,server=on,wait=off,id=char1 \
--monitor chardev=char1 \
--blockdev driver=file,node-name=raw_src,filename=src_disk.raw \
--blockdev driver=raw,node-name=src,file=raw_src \
--blockdev driver=file,node-name=root_file,filename=layer1.qcow2 \
--blockdev driver=qcow2,node-name=root,file=root_file,backing=src \
--nbd-server addr.type=inet,addr.host=0,addr.port=10809 \
--blockdev driver=copy-on-read,node-name=root_crw,file=root \
--export type=fuse,id=fuse-root,node-name=root_crw,mountpoint=/tmp/fuse_exp,writable=off,growable=off
Reading from a fuse export that uses blockdev with copy-on-read
and also issuing a block-stream
causes a deadlock.
QSD must be killed with signal 9.
Steps to reproduce
- Start QSD
- Issue a
block-stream
and a read from the fuse export at the same time
Term 1:
(QEMU) block-stream device=root job-id=job1
{"return": {}}
(QEMU)
{'timestamp': {'seconds': 1708386076, 'microseconds': 965781}, 'event': 'JOB_STATUS_CHANGE', 'data': {'status': 'created', 'id': 'job1'}}
{'timestamp': {'seconds': 1708386076, 'microseconds': 965838}, 'event': 'JOB_STATUS_CHANGE', 'data': {'status': 'running', 'id': 'job1'}}
(QEMU)
(QEMU)
(QEMU)
(QEMU) query-block-jobs
<HANGS>
Term 2:
dd if=/tmp/fuse_exp of=/dev/null bs=1M skip=2000
<HANGS>
$ pidof qemu-storage-daemon
92313
$ sudo cat /proc/92313/task/92313/stack
[<0>] do_sys_poll+0x4e1/0x5d0
[<0>] __x64_sys_ppoll+0xe2/0x170
[<0>] do_syscall_64+0x64/0xe0
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
$ sudo cat /proc/92313/task/92314/stack
[<0>] futex_wait_queue+0x63/0x90
[<0>] __futex_wait+0x14f/0x1c0
[<0>] futex_wait+0x77/0x110
[<0>] do_futex+0xcb/0x190
[<0>] __x64_sys_futex+0x129/0x1e0
[<0>] do_syscall_64+0x64/0xe0
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
Additional information
This might also be a general between block-stream
and copy-on-read
but I could only trigger the problem with FUSE and not NBD. E.g this command does not deadlock:
--export type=nbd,id=nbd-root,node-name=root_crw,name=root_crw,writable=off
nbdfuse /tmp/tmp.69dRvNXj1O/disk nbd://localhost:10809/root_crw
dd if=/tmp/tmp.69dRvNXj1O/disk of=/dev/null bs=1M skip=2000
Edited by lukts30