Memory leak in VNC TLS handshake

Host environment

  • Operating system: Linux
  • OS/kernel version: 5.10
  • Architecture: x86,ARM
  • QEMU flavor: qemu-system-x86_64, qemu-aarch64
  • QEMU version: 6.2
  • QEMU command line:
    ./qemu-system-x86_64 -name qemu_vm -machine pc-i440fx-6.2,accel=kvm,usb=off -cpu host -smp 4 -m 4G -device nec-usb-xhci -device usb-kbd -device usb-tablet -device usb-storage,drive=install -device virtio-balloon -drive if=none,id=install,format=raw,file=/Images/TestImg -device virtio-gpu-pci -qmp unix:/tmp/qmp.sock,server,nowait -serial stdio -object '{"qom-type":"tls-creds-x509","id":"vnc-tls-creds0","dir":"/etc/pki/libvirt-vnc","endpoint":"server","verify-peer":true}' -vnc 0.0.0.0:0,tls-creds=vnc-tls-creds0 -device VGA,id=video0,vgamem_mb=16,bus=pci.0,addr=0x6 -D /tmp/log

Emulated/Virtualized environment

  • Operating system: centos8.2
  • OS/kernel version: 4.18.0-193.el8.x86_64
  • Architecture: x86,arm

Description of problem

While QEMU performs a TLS handshake for a VNC client, qio_channel_tls_handshake_task sets up a watch on vs->sioc. If the number of concurrent VNC connections exceeds the maximum allowed by QEMU, vnc_connect traverses the connection requests, finds the first one whose share mode is VNC_SHARE_MODE_CONNECTING, and disconnects it. If the disconnected request has not yet entered qio_channel_tls_handshake_io, the data pointer allocated in qio_channel_tls_handshake_task is leaked directly, and the task and its associated pointers are leaked indirectly.
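
The leak pattern described above can be illustrated with a small standalone model. This is only a sketch and not QEMU code: `Task`, `HandshakeData`, `Watch`, and every function name below are hypothetical stand-ins. It assumes only what the description states, namely that the handshake allocates data that is released when the I/O callback runs, and that a connection torn down before the callback runs loses that data. The optional `destroy` hook is an extra assumption added to show one way the model could release the data on early teardown; it is not a claim about how QEMU behaves or should be fixed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are QEMU types. */
typedef struct Task { int id; } Task;
typedef struct HandshakeData { Task *task; } HandshakeData;

typedef void (*WatchFunc)(void *opaque);
typedef void (*DestroyFunc)(void *opaque);

typedef struct Watch {
    WatchFunc func;      /* callback run when the channel is ready */
    DestroyFunc destroy; /* optional cleanup hook, may be NULL */
    void *opaque;        /* data handed to the callback */
    bool active;
} Watch;

static int live_allocs;  /* number of blocks not yet freed */

static void *tracked_alloc(size_t size)
{
    void *p = calloc(1, size);
    if (p == NULL) {
        abort();
    }
    live_allocs++;
    return p;
}

static void tracked_free(void *p)
{
    assert(p != NULL);
    free(p);
    live_allocs--;
}

/* Frees the data and the task that is reachable only through it. */
static void handshake_data_free(void *opaque)
{
    HandshakeData *data = opaque;
    tracked_free(data->task);
    tracked_free(data);
}

/* Models the I/O callback: it consumes and releases the data. */
static void handshake_io(void *opaque)
{
    handshake_data_free(opaque);
}

/* Models starting the handshake: allocate data, register a watch. */
static Watch start_handshake(DestroyFunc destroy)
{
    HandshakeData *data = tracked_alloc(sizeof(*data));
    data->task = tracked_alloc(sizeof(Task));
    Watch w = { handshake_io, destroy, data, true };
    return w;
}

/* The channel became ready: run the callback once. */
static void watch_dispatch(Watch *w)
{
    if (w->active) {
        w->active = false;
        w->func(w->opaque);
    }
}

/* The connection was dropped before the callback ran. */
static void watch_remove(Watch *w)
{
    if (w->active) {
        w->active = false;
        if (w->destroy != NULL) {
            w->destroy(w->opaque);
        }
    }
}

/* Returns how many blocks were still allocated after the scenario. */
static int run_scenario(bool disconnect_early, DestroyFunc destroy)
{
    live_allocs = 0;
    Watch w = start_handshake(destroy);
    if (disconnect_early) {
        watch_remove(&w);
    } else {
        watch_dispatch(&w);
    }
    int remaining = live_allocs;
    if (remaining > 0) {
        /* Reclaim the blocks so the demo itself stays leak-free. */
        handshake_data_free(w.opaque);
    }
    return remaining;
}
```

In this model, `run_scenario(false, NULL)` returns 0 because the callback ran and released everything, while `run_scenario(true, NULL)` returns 2: the data block (the direct leak) and the task reachable only through it (the indirect leak). Passing `handshake_data_free` as the hook brings the early-disconnect case back to 0.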

Steps to reproduce

To make it easier to reproduce the issue:

  1. Change the I/O condition watched in qio_channel_tls_handshake_task to a special value (for example, G_IO_PRI).
  2. Change vd->connections_limit to a smaller value (such as 2) in vnc_connect.
  3. Pass --enable-sanitizers to configure when building QEMU.
  4. Use the compiled qemu binary to create a virtual machine and continuously connect to it via VNC TLS.
  5. Shut down the virtual machine and check the logs.

Reported by my colleague: jiangyegen@h-partners.com

Additional information

The connection count verification and disconnection logic in vnc_connect comes from this commit: https://github.com/qemu/qemu/commit/e5f34cdd2da54f28d90889a3afd15fad2d6105ff

Its description states that the purpose of the change is to refuse new connections once the number of concurrent connections reaches the limit, but the code actually disconnects other connections instead, which indirectly led to this memory leak.

I tried removing the QTAILQ_FOREACH loop so that only the current connection is disconnected when the connection limit is reached. The memory leak no longer seems to be triggered, but I am not sure whether this causes other issues.
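
The difference between the two behaviours can be sketched with a simplified model. This is not the code from vnc_connect: `Conn`, `ShareMode`, `LimitPolicy`, and `enforce_limit` are hypothetical names, and the model assumes the newly accepted connection is the last array element and is still in the connecting state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are QEMU types. */
typedef enum { MODE_CONNECTING, MODE_CONNECTED } ShareMode;

typedef struct Conn {
    int id;
    ShareMode mode;
    bool disconnected;
} Conn;

typedef enum {
    POLICY_DROP_FIRST_CONNECTING, /* behaviour described in the report */
    POLICY_REJECT_NEW             /* the change tried by the reporter */
} LimitPolicy;

/*
 * conns[n - 1] is the newly accepted connection.
 * Returns the id of the connection that was disconnected,
 * or -1 if the limit was not exceeded.
 */
static int enforce_limit(Conn *conns, size_t n, size_t limit,
                         LimitPolicy policy)
{
    assert(conns != NULL || n == 0);

    size_t connecting = 0;
    for (size_t i = 0; i < n; i++) {
        if (!conns[i].disconnected && conns[i].mode == MODE_CONNECTING) {
            connecting++;
        }
    }
    if (n == 0 || connecting <= limit) {
        return -1;
    }

    if (policy == POLICY_REJECT_NEW) {
        /* Only the connection that just arrived is turned away. */
        conns[n - 1].disconnected = true;
        return conns[n - 1].id;
    }

    /* Walk the list and drop the first still-connecting entry. */
    for (size_t i = 0; i < n; i++) {
        if (!conns[i].disconnected && conns[i].mode == MODE_CONNECTING) {
            conns[i].disconnected = true;
            return conns[i].id;
        }
    }
    return -1;
}
```

With three connecting entries (ids 1, 2, 3) and a limit of 2, the first policy disconnects id 1, which may be in the middle of its handshake, while the second disconnects only id 3, the connection that has just arrived.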
