
Replay/record does not work with `rrsnapshot`/`loadvm`

Host environment

  • Operating system: Ubuntu 20.04.6 LTS
  • OS/kernel version: Linux ub2004 5.4.0-153-generic #170-Ubuntu SMP Fri Jun 16 13:43:31 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • Architecture: x86_64
  • QEMU flavor: qemu-system-x86_64
  • QEMU version: 9.1.0
  • QEMU command line:
$ qemu-system-x86_64 \
  -cpu SandyBridge -smp 1 \
  -serial stdio -display none \
  -m 4096 \
  -drive file=./empty.qcow2,id=rr \
  -kernel ./boot/vmlinuz-lts \
  -initrd ./boot/initramfs-lts \
  -monitor telnet::12345,server,nowait \
  -append "console=ttyS0 root=/dev/ram0 alpine_dev=cdrom:iso9660 modules=loop,squashfs,sd-mod,usb-storage quiet" \
  -icount shift=auto,rrfile=rr,rr=replay,rrsnapshot=init

Emulated/Virtualized environment

  • Operating system: alpine-standard-3.20.3-x86_64.iso
  • OS/kernel version: -
  • Architecture: x86_64

Description of problem

QEMU's record/replay feature does not work properly when snapshots are involved (for example, rrsnapshot).

Record/replay without snapshots works fine, but with rrsnapshot=... the replay hangs at boot. The loadvm monitor command hangs QEMU in the same way.

Record command:

$ qemu-system-x86_64 \
  -cpu SandyBridge -smp 1 \
  -serial stdio -display none \
  -m 4096 \
  -drive file=./empty.qcow2,id=rr \
  -kernel ./boot/vmlinuz-lts \
  -initrd ./boot/initramfs-lts \
  -monitor telnet::12345,server,nowait \
  -append "console=ttyS0 root=/dev/ram0 alpine_dev=cdrom:iso9660 modules=loop,squashfs,sd-mod,usb-storage quiet" \
  -icount shift=auto,rrfile=rr,rr=record,rrsnapshot=init

Broken replay command, which gets QEMU stuck:

$ qemu-system-x86_64 \
  -cpu SandyBridge -smp 1 \
  -serial stdio -display none \
  -m 4096 \
  -drive file=./empty.qcow2,id=rr \
  -kernel ./boot/vmlinuz-lts \
  -initrd ./boot/initramfs-lts \
  -monitor telnet::12345,server,nowait \
  -append "console=ttyS0 root=/dev/ram0 alpine_dev=cdrom:iso9660 modules=loop,squashfs,sd-mod,usb-storage quiet" \
  -icount shift=auto,rrfile=rr,rr=replay,rrsnapshot=init

qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.tsc-deadline [bit 24]

Record/replay without rrsnapshot/loadvm/etc works as expected.

Steps to reproduce

To reproduce, I used the Alpine Linux kernel as the guest:

wget https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-standard-3.20.3-x86_64.iso
7z x alpine-standard-3.20.3-x86_64.iso

Prerequisite: an empty qcow2 file for snapshots:

qemu-img create -f qcow2 empty.qcow2 1G

Running the Alpine Linux kernel with rr=record works fine: the kernel boots and accepts input.

$ qemu-system-x86_64 \
  -cpu SandyBridge -smp 1 \
  -serial stdio -display none \
  -m 4096 \
  -drive file=./empty.qcow2,id=rr \
  -kernel ./boot/vmlinuz-lts \
  -initrd ./boot/initramfs-lts \
  -monitor telnet::12345,server,nowait \
  -append "console=ttyS0 root=/dev/ram0 alpine_dev=cdrom:iso9660 modules=loop,squashfs,sd-mod,usb-storage quiet" \
  -icount shift=auto,rrfile=rr,rr=record,rrsnapshot=init

qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.tsc-deadline [bit 24]
mount: mounting /dev/ram0 on /sysroot failed: Invalid argument
Mounting root failed. 
initramfs emergency recovery shell launched. Type 'exit' to continue boot
sh: can't access tty; job control turned off
~ # ls -alh
total 32K    
drwx------   18 root     root           0 Oct 21 13:02 .
drwx------   18 root     root           0 Oct 21 13:02 ..
-rw-------    1 root     root           8 Oct 21 13:02 .ash_history
drwxr-xr-x    2 root     root           0 Jun 18 12:44 .modloop
drwxr-xr-x    2 root     root           0 Oct 21 13:02 bin
drwxr-xr-x    9 root     root        2.5K Oct 21 13:02 dev
drwxr-xr-x    4 root     root           0 Oct 21 13:02 etc
-rwxr-xr-x    1 root     root       25.9K Jun 18 12:44 init
drwxr-xr-x    5 root     root           0 Jun 18 12:44 lib
drwxr-xr-x    5 root     root           0 Jun 18 12:44 media
drwxr-xr-x    2 root     root           0 Jun 18 12:44 newroot
dr-xr-xr-x  114 root     root           0 Oct 21 13:02 proc
drwx------    2 root     root           0 Sep  4 12:53 root
drwxr-xr-x    3 root     root           0 Oct 21 13:02 run
drwxr-xr-x    2 root     root           0 Oct 21 13:02 sbin
dr-xr-xr-x   13 root     root           0 Oct 21 13:02 sys
drwxr-xr-x    2 root     root           0 Oct 21 13:02 sysroot
drwxr-xr-x    2 root     root           0 Oct 21 13:02 tmp
drwxr-xr-x    5 root     root           0 Oct 21 13:02 usr
drwxr-xr-x    3 root     root           0 Jun 18 12:44 var
~ # echo "AAAAAAAA?"
AAAAAAAA?
~ # 
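
Because replay with rrsnapshot=init relies on a snapshot named init having been saved into empty.qcow2 during recording, one sanity check that may help narrow this down is to confirm the snapshot exists in the image. The following is a minimal sketch, not something taken from the original runs; the helper names snapshot_tags and has_snapshot are made up for illustration, and the parsing assumes the usual two header lines of `qemu-img snapshot -l` and tags without spaces.

```shell
# Extract the TAG column from the table printed by `qemu-img snapshot -l`.
# Assumes two header lines ("Snapshot list:" and the column titles)
# and snapshot tags that contain no whitespace.
snapshot_tags() {
  awk 'NR > 2 && NF >= 2 { print $2 }'
}

# has_snapshot IMAGE TAG: succeed if TAG is listed among IMAGE's snapshots.
has_snapshot() {
  qemu-img snapshot -l "$1" | snapshot_tags | grep -qxF -- "$2"
}
```

For example, `has_snapshot ./empty.qcow2 init && echo "snapshot present"`. It is best run after QEMU has exited, since qemu-img may refuse to open an image that a running QEMU still has locked.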

An rr file is produced, which can be replayed without the rrsnapshot option:

$ qemu-system-x86_64 \
  -cpu SandyBridge -smp 1 \
  -serial stdio -display none \
  -m 4096 \
  -drive file=./empty.qcow2,id=rr \
  -kernel ./boot/vmlinuz-lts \
  -initrd ./boot/initramfs-lts \
  -monitor telnet::12345,server,nowait \
  -append "console=ttyS0 root=/dev/ram0 alpine_dev=cdrom:iso9660 modules=loop,squashfs,sd-mod,usb-storage quiet" \
  -icount shift=auto,rrfile=rr,rr=replay

qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.tsc-deadline [bit 24]
mount: mounting /dev/ram0 on /sysroot failed: Invalid argument
Mounting root failed. 
initramfs emergency recovery shell launched. Type 'exit' to continue boot
sh: can't access tty; job control turned off
~ # ls -alh
total 32K    
drwx------   18 root     root           0 Oct 21 13:02 .
drwx------   18 root     root           0 Oct 21 13:02 ..
-rw-------    1 root     root           8 Oct 21 13:02 .ash_history
drwxr-xr-x    2 root     root           0 Jun 18 12:44 .modloop
drwxr-xr-x    2 root     root           0 Oct 21 13:02 bin
drwxr-xr-x    9 root     root        2.5K Oct 21 13:02 dev
drwxr-xr-x    4 root     root           0 Oct 21 13:02 etc
-rwxr-xr-x    1 root     root       25.9K Jun 18 12:44 init
drwxr-xr-x    5 root     root           0 Jun 18 12:44 lib
drwxr-xr-x    5 root     root           0 Jun 18 12:44 media
drwxr-xr-x    2 root     root           0 Jun 18 12:44 newroot
dr-xr-xr-x  114 root     root           0 Oct 21 13:02 proc
drwx------    2 root     root           0 Sep  4 12:53 root
drwxr-xr-x    3 root     root           0 Oct 21 13:02 run
drwxr-xr-x    2 root     root           0 Oct 21 13:02 sbin
dr-xr-xr-x   13 root     root           0 Oct 21 13:02 sys
drwxr-xr-x    2 root     root           0 Oct 21 13:02 sysroot
drwxr-xr-x    2 root     root           0 Oct 21 13:02 tmp
drwxr-xr-x    5 root     root           0 Oct 21 13:02 usr
drwxr-xr-x    3 root     root           0 Jun 18 12:44 var
~ # echo "AAAAAAAA?"
AAAAAAAA?
~ # 

As you can see, replaying the emulation session works as expected. However, if I add the rrsnapshot option, it gets stuck:

$ qemu-system-x86_64 \
  -cpu SandyBridge -smp 1 \
  -serial stdio -display none \
  -m 4096 \
  -drive file=./empty.qcow2,id=rr \
  -kernel ./boot/vmlinuz-lts \
  -initrd ./boot/initramfs-lts \
  -monitor telnet::12345,server,nowait \
  -append "console=ttyS0 root=/dev/ram0 alpine_dev=cdrom:iso9660 modules=loop,squashfs,sd-mod,usb-storage quiet" \
  -icount shift=auto,rrfile=rr,rr=replay,rrsnapshot=init

qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.tsc-deadline [bit 24] 

This can also be reproduced without the rrsnapshot option, by issuing loadvm init from the QEMU monitor:

$ telnet localhost 12345
qemu> loadvm init
...

Or by attaching gdb and issuing reverse commands that need loadvm to restore an earlier state, such as reverse-stepi or reverse-continue.

Attaching a debugger and adding debug prints shows one thread stuck in rcu.c, near qemu_event_wait(&rcu_call_ready_event);. I waited for about an hour and nothing changed.
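
A backtrace of every QEMU thread is likely to be more useful for triage than a single frame, since the wait on rcu_call_ready_event may simply be where the RCU callback thread sits when it has no work (this is an assumption and was not verified against this build). A minimal sketch for capturing such a backtrace follows; the helper name dump_qemu_threads and the GDB override variable are hypothetical, introduced only for illustration.

```shell
# Print a backtrace of every thread in a running process, then detach.
# dump_qemu_threads is a hypothetical helper name.
# Setting GDB (e.g. GDB=echo) overrides the debugger binary for a dry run.
dump_qemu_threads() {
  "${GDB:-gdb}" -p "$1" -batch -ex 'thread apply all bt'
}
```

With a single hung instance, it could be invoked as `dump_qemu_threads "$(pidof qemu-system-x86_64)"`. Depending on the host's ptrace restrictions, attaching may need root.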

Additional information

QEMU build. The QEMU binary was built from the 9.1.0 sources with --target-list=x86_64-softmmu.

Host machine. An almost clean Ubuntu 20.04 with the packages needed to build QEMU from the latest release sources.

Edited by kotborealis