QEMU Guest Agent (qga) high CPU usage (1 core at 100%). May happen with guest-network-get-interfaces. Strace says: EAGAIN (Resource temporarily unavailable)
## Host environment

- Operating system: Fedora 37
- OS/kernel version: Linux fedora 6.2.9-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Mar 30 22:31:57 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
- Architecture: x86_64
- QEMU flavor: qemu-system-x86_64
- QEMU version: QEMU emulator version 7.0.0 (qemu-7.0.0-15.fc37)
- QEMU command line:

```
#!/bin/bash
args=(
  -name "108-TEST-VM",debug-threads=on
  -pidfile /run/test-vm-108.pid
  -smbios type=1,uuid="49a1070a-3cf0-40a4-9d6f-cc142bb4f7ea"
  -overcommit mem-lock=off
  -serial none
  -parallel none
  -k en-us
  -vga none
  -global nec-usb-xhci.msi=off
  -global kvm-pit.lost_tick_policy=discard
  -global ICH9-LPC.disable_s3=1
  -global ICH9-LPC.disable_s4=1
  -no-hpet
  -enable-kvm
  -device usb-ehci,id=ehci
  -device qemu-xhci,id=xhci
  -device usb-tablet,bus=ehci.0
  -device usb-kbd,bus=ehci.0
  -drive if=pflash,format=raw,readonly=on,file=/usr/share/edk2/ovmf/OVMF_CODE.fd
  -netdev tap,id=netdev0,vhost=on,script=ifup.sh,downscript=ifdown.sh
  -device virtio-net-pci,netdev=netdev0,id=nic0,mac=AA:BB:CC:DD:EE:FF
  -vnc :8
  -display none
  -device VGA
  -device virtio-serial-pci
  -chardev socket,id=charchannel0,server=on,wait=off,path=/run/test-vm-108.qmp
  -mon chardev=charchannel0,mode=control
  -chardev socket,id=charchannel1,server=on,wait=off,path=/run/test-vm-108.qga
  -device virtserialport,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
  -drive id=drive0,file=108.qcow2,if=none,format=qcow2,cache=writeback,discard=unmap,detect-zeroes=unmap
  -device nvme,drive=drive0,serial=AABBCCDD
  -machine q35,smm=on,vmport=off,dump-guest-core=off,kernel_irqchip=on,mem-merge=off
  -rtc base=localtime,clock=host,driftfix=slew
  -cpu host
  -smp 8
  -m 8G
  -boot menu=on,strict=on,splash-time=5000,reboot-timeout=5000
  -no-user-config
  -nodefaults
)
/usr/bin/qemu-system-x86_64 "${args[@]}"
```

## Emulated/Virtualized environment

- Operating system: Fedora 37
- OS/kernel version: Linux fedora 6.2.9-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Mar 30 22:31:57 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
- Architecture: x86_64

## Description of problem

I have a VM with the QEMU guest agent installed. I use QGA to periodically query the network interfaces: I run `guest-network-get-interfaces` every 1-2 seconds.

After a while (maybe a day or so), QGA seems to lock up with one core at 100% CPU. It stops replying to commands, and restarting the service sometimes doesn't work, so a hard reboot is the only way out.

`dmesg` doesn't show anything useful or relevant. After editing `qemu-guest-agent.service` to run the agent under `/usr/bin/strace`, I see this repeated in a loop:

```
strace[114154]: write(4, "{\"return\": [{\"name\": \"lo\", \"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
strace[114154]: write(4, "{\"return\": [{\"name\": \"lo\", \"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
strace[114154]: write(4, "{\"return\": [{\"name\": \"lo\", \"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
strace[114154]: write(4, "{\"return\": [{\"name\": \"lo\", \"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
strace[114154]: write(4, "{\"return\": [{\"name\": \"lo\", \"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
strace[114154]: write(4, "{\"return\": [{\"name\": \"lo\", \"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
strace[114154]: write(4, "{\"return\": [{\"name\": \"lo\", \"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
strace[114154]: write(4, "{\"return\": [{\"name\": \"lo\", \"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
```

I don't know enough to debug this further, but I can provide more information given some guidance.

**Don't know if it helps or matters**, but the guest VM is running Docker with around 10 containers, so when QGA works I get around 18 network interfaces, counting loopback, Docker `veth`s and `br` interfaces.

## Steps to reproduce

1. Create a VM with Fedora 37
2. Install the QEMU Guest Agent
3. Call `guest-network-get-interfaces` in a loop every 1-2 seconds (each call starting after the previous one finishes) over the QGA Unix socket, using the Python script below, invoked as: `python qga.py --socket /run/test-vm-108.qga '{ "execute": "guest-network-get-interfaces" }'`
4. Eventually, the guest agent will lock up at 100% CPU usage on 1 core

## Additional information

Python script used to call QGA:

```
import argparse
import socket
import sys


def main():
    buf_size = 1024
    timeout_secs = .5

    parser = argparse.ArgumentParser()
    parser.add_argument('--socket', required=True, help='Path to Unix socket')
    parser.add_argument('request', help='Request to send')
    args = parser.parse_args()

    unix_socket_path = args.socket
    request = args.request

    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout_secs)
            sock.connect(unix_socket_path)

            request_bytes = request.encode('utf-8')
            sock.sendall(request_bytes)

            # Wait up to timeout_secs for the first chunk of the reply.
            response_bytes = b''
            received_bytes = sock.recv(buf_size)
            response_bytes += received_bytes

            # Switch to non-blocking mode and drain whatever is already buffered.
            sock.setblocking(False)
            while True:
                try:
                    received_bytes = sock.recv(buf_size)
                    if not received_bytes:
                        break
                    response_bytes += received_bytes
                except (BlockingIOError, TimeoutError):
                    # Nothing more to read right now: stop reading.
                    break
                except (FileNotFoundError, ConnectionRefusedError):
                    sock.close()
                    sys.exit()

            response = response_bytes.decode('utf-8').strip()
            print(response)
    except (TimeoutError, FileNotFoundError, BlockingIOError, ConnectionRefusedError):
        sys.exit()


if __name__ == "__main__":
    main()
```
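For anyone trying to reproduce this, the polling loop from step 3 can be sketched in a single Python file as below. This is a hypothetical variant, not the script used for this report: `read_json_reply` and `poll_interfaces` are names made up for the sketch, and the timeouts and iteration count are arbitrary. Unlike `qga.py`, it keeps reading until the buffered bytes parse as one complete JSON document rather than stopping at the first empty non-blocking read. Whether that difference has any bearing on the lock-up is not established here.

```python
import json
import socket
import time

# Hypothetical helper names; not part of QEMU or of the qga.py script above.


def read_json_reply(sock, buf_size=1024, timeout_secs=5.0):
    """Read from sock until the buffered bytes parse as one JSON document.

    Raises TimeoutError if no complete document arrives within timeout_secs
    (on Python 3.10+, where socket.timeout is an alias of TimeoutError), and
    ConnectionError if the peer closes the connection first.
    """
    deadline = time.monotonic() + timeout_secs
    buf = b''
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError('incomplete reply: %d bytes buffered' % len(buf))
        sock.settimeout(remaining)
        chunk = sock.recv(buf_size)
        if not chunk:
            raise ConnectionError('socket closed before a full reply arrived')
        buf += chunk
        try:
            # json.loads tolerates trailing whitespace such as a final newline.
            return json.loads(buf.decode('utf-8'))
        except ValueError:
            # Incomplete JSON or a split multi-byte character: keep reading.
            continue


def poll_interfaces(socket_path, interval_secs=1.0, iterations=3):
    """Send guest-network-get-interfaces repeatedly, one connection per call."""
    request = json.dumps({'execute': 'guest-network-get-interfaces'}).encode('utf-8')
    for _ in range(iterations):
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(5.0)
            sock.connect(socket_path)
            sock.sendall(request)
            reply = read_json_reply(sock)
        print(len(reply.get('return', [])), 'interfaces')
        # Wait before the next call, which starts only after this one finished.
        time.sleep(interval_secs)
```

With the command line from this report, it would be called as `poll_interfaces('/run/test-vm-108.qga')`, raising `iterations` for a long-running test.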