Skip to content

RISC-V: Instruction fetch exceptions can have invalid tval/epc combination

Host environment

  • Operating system: NixOS (unstable)
  • OS/kernel version: Linux 5.19.0
  • Architecture: x86_64
  • QEMU flavor: qemu-system-riscv64
  • QEMU version: 7.0.0 or master (a6b1c53e)
  • QEMU command line:
    qemu-system-riscv64 -m 512M -M virt -nographic -kernel Image -append "earlycon=sbi" -initrd initrd.cpio -d int -D log.txt

Emulated/Virtualized environment

  • Operating system: Linux 5.19
  • OS/kernel version: Linux 5.19
  • Architecture: riscv64

Description of problem

Instruction page fault / guest-page fault / access fault exceptions can have invalid epc/tval combinations, for example as shown in the debug log:

riscv_cpu_do_interrupt: hart:0, async:0, cause:0000000000000014, epc:0xffffffff802fec76, tval:0xffffffff802ff000, desc=guest_exec_page_fault
riscv_cpu_do_interrupt: hart:0, async:0, cause:0000000000000014, epc:0xffffffff80243fe6, tval:0xffffffff80244000, desc=guest_exec_page_fault

From the privileged spec:

If mtval is written with a nonzero value when an instruction access-fault or page-fault exception occurs on a system with variable-length instructions, then mtval will contain the virtual address of the portion of the instruction that caused the fault, while mepc will point to the beginning of the instruction.

Currently RISC-V only has 32-bit and 16-bit instructions, so the difference tval - epc should be either 0 or 2. In the examples above the differences are 906 and 26 respectively.

Possibly notable: all occurrences of these invalid combinations to have tval aligned to a page-boundary.

Steps to reproduce

This one only gives invalid tval/epc combinations with instruction guest-page faults, but I've found it to be the easiest reproducer to describe, since presumably running KVM in RISC-V QEMU is a standard setup. I have not otherwise been able to find a more minimal case.

  1. Start a QEMU-based riscv64 machine
  2. Start a KVM-based virtual machine with QEMU inside it
  3. Do some stuff in the KVM-based virtual machine to increase the chance of page faults
  4. Look in the debug log of the outer QEMU for guest_exec_page_fault exceptions with tval ending in 000, but epc ending in neither 000 nor ffe

Everything in both layers of guests should otherwise work without issue, but other/future software that relies on the spec-mandated relationship of epc/tval may break.

Additional information

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information