virtio-gpu: Only a black screen is shown after the guest VM resumes from S3
Host environment
- Operating system: Xen 4.18-unstable on Ubuntu 22.04.2 LTS
- OS/kernel version: Linux 6.0
- Architecture: x86
- QEMU flavor: qemu-system-i386
- QEMU version: 7.2.0
- QEMU command line:
./usr/local/lib/xen/bin/qemu-system-i386 -xen-domid 2 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-2,server=on,wait=off -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-2,server=on,wait=off -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name domAll -vnc 127.0.0.1:0,to=99 -display none -display sdl -boot order=cda -usb -usbdevice tablet -smp 12,maxcpus=12 -device rtl8139,id=nic0,netdev=net0,mac=00:16:3e:63:5e:99 -netdev type=tap,id=net0,ifname=vif2.0-emu,br=ovsbr0,script=no,downscript=no -machine xenfv,suppress-vmdesc=on -display sdl,gl=on -device virtio-vga-gl,context_init=true,blob=true,hostmem=4G -m 4096 -drive file=/home/cjq/code/u2204.qcow2,if=ide,index=0,media=disk,format=qcow2,cache=writeback
Emulated/Virtualized environment
- Operating system: Ubuntu 22.04.2 LTS
- OS/kernel version: Linux 6.0
- Architecture: x86
Description of problem
On the Xen hypervisor, the host (dom0) is PVH and the guest (domU) is HVM, with virtio-gpu configured for the guest.
Phenomenon
If you put the guest into S3 and then resume it, the guest system resumes correctly, but its display does not: you only see a black screen with no mouse cursor.
Even though there is no mouse cursor, you can still move and click the mouse, and you will then see timeout errors in dmesg. The guest's virtio-gpu frontend tries to send VIRTIO_GPU_CMD_MOVE_CURSOR requests to the backend in QEMU but never gets a response.
Problem
After an in-depth investigation, I found two problems that prevent the display from coming back.
- First: during S3, QEMU calls virtio_reset()->__virtio_queue_reset(), which sets vdev->vq[i].vring.desc = 0 (clearing all virtqueue information), but the guest does not delete the virtqueues on its side. So after resuming, the guest can still send ctrl/cursor requests to QEMU, but QEMU cannot receive them (QEMU sees "vring.desc == 0" and returns immediately in virtio_queue_notify).
- Second: during S3, QEMU calls virtio_reset()->virtio_gpu_gl_reset()->virtio_gpu_reset()->virtio_gpu_resource_destroy() to destroy all resources that were created with VIRTIO_GPU_CMD_RESOURCE_CREATE_* commands. As a result, the guest's display cannot come back and continue from the state it was in when it was suspended.
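To make the first problem concrete, here is a small toy model in C. It is only a sketch: the struct and function names (toy_queue, toy_device_reset, toy_queue_notify) are made up for illustration and are not QEMU's real types or functions. It only mimics the behaviour described above, where a reset zeroes the ring address and a later notification is then dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: this is NOT QEMU's real VirtQueue. */
struct toy_queue {
    uint64_t desc;     /* address of the descriptor table; 0 = not set up */
    unsigned handled;  /* notifications the device actually processed */
};

/* Mimics the effect of the reset path: the ring address is cleared. */
void toy_device_reset(struct toy_queue *q)
{
    assert(q != NULL);
    q->desc = 0;
}

/* Mimics the early return on notify: no ring address, no processing. */
bool toy_queue_notify(struct toy_queue *q)
{
    assert(q != NULL);
    if (q->desc == 0) {
        return false;  /* request is dropped; the guest eventually times out */
    }
    q->handled++;
    return true;
}
```

In this model, a notify before toy_device_reset() is handled, while the same notify after the reset returns false. That matches the timeouts seen for VIRTIO_GPU_CMD_MOVE_CURSOR when the guest keeps using queues that the device side has already cleared.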
Solution
I have written patches that solve the problems above. With these patches, S3 works correctly.
For problem one: add freeze and restore functions for virtio-gpu in the Linux kernel. When the guest suspends, virtio_gpu_freeze is called to delete the queues. When the guest resumes, virtio_gpu_restore is called to re-initialize the queues.
For problem two: add a new feature flag "freeze_mode" for virtio-gpu in QEMU. When the guest suspends, it notifies QEMU that virtio-gpu is entering "freeze_S3" mode, and QEMU then does not destroy the resources. When the guest resumes, it notifies QEMU that virtio-gpu is entering "unfreeze" mode, after which QEMU behaves normally with no other impact.
Please refer to the links in the "Additional information" section for the community's comments and discussions about my implementation.
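The two fixes above can be illustrated together with a toy model in C. This is only a sketch and not the code from the patches: every name here (sketch_gpu, sketch_freeze_mode, sketch_guest_freeze, and so on) is hypothetical, and the order of "announce mode" and "delete queues" inside the guest functions is an assumption. See the linked patches for the real implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; the real patches define their own. */
enum sketch_freeze_mode {
    SKETCH_UNFREEZE = 0,  /* normal operation */
    SKETCH_FREEZE_S3      /* guest announced that it is entering S3 */
};

struct sketch_gpu {
    enum sketch_freeze_mode freeze_mode;
    uint64_t ctrl_desc;   /* 0 means the control queue is not set up */
    unsigned resources;   /* number of live guest-created resources */
};

/* Device side: reset always clears the queue, but keeps resources in S3 mode. */
void sketch_device_reset(struct sketch_gpu *g)
{
    assert(g != NULL);
    g->ctrl_desc = 0;
    if (g->freeze_mode != SKETCH_FREEZE_S3) {
        g->resources = 0;
    }
}

/* Guest side, suspend path: announce S3 mode, then delete the queues. */
void sketch_guest_freeze(struct sketch_gpu *g)
{
    assert(g != NULL);
    g->freeze_mode = SKETCH_FREEZE_S3;
    g->ctrl_desc = 0;
}

/* Guest side, resume path: re-create the queues, then announce unfreeze. */
void sketch_guest_restore(struct sketch_gpu *g, uint64_t new_desc)
{
    assert(g != NULL && new_desc != 0);
    g->ctrl_desc = new_desc;
    g->freeze_mode = SKETCH_UNFREEZE;
}
```

In this model, a reset that happens between sketch_guest_freeze() and sketch_guest_restore() leaves the resource count untouched, and the queue is valid again after restore. A reset in "unfreeze" mode still destroys the resources, so normal reset behaviour is unchanged.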
Steps to reproduce
- In the guest VM, run "sudo su root" and then "echo mem > /sys/power/state"
- On the host, run "sudo xl trigger <guest id> s3resume"
Additional information
Patches in virtio-spec
v1: https://lists.oasis-open.org/archives/virtio-comment/202306/msg00595.html
v2: https://lists.oasis-open.org/archives/virtio-comment/202307/msg00160.html
v3: https://lists.oasis-open.org/archives/virtio-comment/202307/msg00209.html
Patches in QEMU
v1: https://lore.kernel.org/qemu-devel/20230608025655.1674357-2-Jiqian.Chen@amd.com/
v2: https://lore.kernel.org/qemu-devel/20230630070016.841459-1-Jiqian.Chen@amd.com/T/#t
v3: https://lore.kernel.org/qemu-devel/20230719074726.1613088-1-Jiqian.Chen@amd.com/T/#t
v4: https://lore.kernel.org/qemu-devel/20230720120816.8751-1-Jiqian.Chen@amd.com/
Patches in kernel
v1: https://lore.kernel.org/lkml/20230608063857.1677973-1-Jiqian.Chen@amd.com/
v2: https://lore.kernel.org/lkml/20230630073448.842767-1-Jiqian.Chen@amd.com/T/#t
v3: https://lore.kernel.org/lkml/20230720115805.8206-1-Jiqian.Chen@amd.com/T/#t