Skip to content

qemu-ga fsfreeze crashes the kernel

Host environment

  • Operating system: Proxmox 6.4
  • OS/kernel version: Linux pve1 5.4.114-1-pve
  • Architecture: x86_64
  • QEMU flavor: qemu-system-x86_64
  • QEMU version: QEMU emulator version 5.2.0 (pve-qemu-kvm_5.2.0)
  • QEMU command line: qemu-ga fs-freeze

Emulated/Virtualized environment

  • Operating system: CentOS 7
  • OS/kernel version: 3.10.0-1160.36.2.el7.x86_64
  • Architecture: x86_64

Description of problem

Hello,

Still required your attention, duplicate from: https://bugs.launchpad.net/bugs/1807073 https://bugs.launchpad.net/bugs/1813045

We use mainly Cloudlinux, Debian and Centos. We experienced many crashes on our qemu instances based on Cloudlinux during a snapshot. The issue is not related to CloudLinux directly, but to Qemu agent, which does not freeze the file system(s) correctly. What is actually happening:

When VM backup is invoked, Qemu agent freezes the file systems, so no single change will be made during the backup. But Qemu agent does not respect the loop* devices in freezing order (we have checked its sources), which leads to the next situation:

  1. freeze loopback fs ---> send async reqs to loopback thread
  2. freeze main fs
  3. loopback thread wakes up and trying to write data to the main fs, which is still frozen, and this finally leads to the hung task and kernel crash.

Moreover, a lot of Proxmox users are complaining about the issue as well: https://forum.proxmox.com/threads/error-vm-100-qmp-command-guest-fsfreeze-thaw-failed-got-timeout.68082/ https://forum.proxmox.com/threads/problem-with-fsfreeze-freeze-and-qemu-guest-agent.65707/

Steps to reproduce

  1. Manually start backup for the VM with qemu-agent enabled.
  2. The backup process stuck at "INFO: issuing guest-agent 'fs-freeze' command"
  3. The VM become unavailable, you can only unlock it and force reset.

Additional information

/var/log/messages logs:
Aug 6 21:54:00 cpanel qemu-ga: info: guest-ping called
Aug 6 21:54:01 cpanel qemu-ga: info: guest-fsfreeze called
Aug 6 21:54:01 cpanel qemu-ga: info: executing fsfreeze hook with arg 'freeze'

after this the VM becomes completely unavailable.

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information