qemu-img cannot repair a qcow2 in an LV because size is mis-detected when qcow2 is on an LV

Host environment

  • Operating system: (Windows 10 21H1, Fedora 34, etc.)
  • OS/kernel version: (For POSIX hosts, use uname -a)
  • Architecture: (x86, ARM, s390x, etc.)
  • QEMU flavor: (qemu-system-x86_64, qemu-aarch64, qemu-img, etc.)
  • QEMU version: (e.g. qemu-system-x86_64 --version)
  • QEMU command line:
    ./qemu-system-x86_64 -M q35 -m 4096 -enable-kvm -hda fedora32.qcow2

Emulated/Virtualized environment

  • Operating system: (Windows 10 21H1, Fedora 34, etc.)
  • OS/kernel version: (For POSIX guests, use uname -a.)
  • Architecture: (x86, ARM, s390x, etc.)

Description of problem

This is RHEV with Tb's of VMs which need to be repaired due to a datacenter-wide (the real datacenter) power outage.

Each of these VMs are on individual LVs but qemu-img check fails to perform repairs:

ERROR cluster 24481205 refcount=0 reference=1
ERROR cluster 24481206 refcount=0 reference=1
Rebuilding refcount structure
ERROR writing refblock: No space left on device <============
qemu-img: Check failed: No space left on device

Running qemu-img check or info on the LV (/dev/dm-*) works well but repairs cannot be completed:

# qemu-img info /dev/cdd4e215-8c6b-4877-b2be-fdba383e7eb0/fb32333b-2334-4e10-8c42-02bc97e826cc
image: /dev/cdd4e215-8c6b-4877-b2be-fdba383e7eb0/fb32333b-2334-4e10-8c42-02bc97e826cc
file format: qcow2
virtual size: 1.5 TiB (1649267441664 bytes)
disk size: 0 B <================================
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: true
    refcount bits: 16
    corrupt: false
    extended l2: false

Steps to reproduce

  1. Have a damaged VM with its qcow2 in an LV
  2. run 'qemu-img check ' verify that it properly detects the blocks which need fixing.
  3. run 'qemu-img check -r all ', it exits with 'no space left on device´ after a few seconds.

Additional information

https://bugzilla.redhat.com/show_bug.cgi?id=1519071

Here is one example:

Edited by Vincent Cojot
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information