RISC-V: Instruction fetch exceptions can have invalid tval/epc combination
Host environment
- Operating system: NixOS (unstable)
- OS/kernel version: Linux 5.19.0
- Architecture: x86_64
- QEMU flavor: qemu-system-riscv64
- QEMU version: 7.0.0 or master (a6b1c53e)
- QEMU command line:
qemu-system-riscv64 -m 512M -M virt -nographic -kernel Image -append "earlycon=sbi" -initrd initrd.cpio -d int -D log.txt
Emulated/Virtualized environment
- Operating system: Linux 5.19
- OS/kernel version: Linux 5.19
- Architecture: riscv64
Description of problem
Instruction page fault / guest-page fault / access fault exceptions can have invalid epc/tval combinations, for example as shown in the debug log:
riscv_cpu_do_interrupt: hart:0, async:0, cause:0000000000000014, epc:0xffffffff802fec76, tval:0xffffffff802ff000, desc=guest_exec_page_fault
riscv_cpu_do_interrupt: hart:0, async:0, cause:0000000000000014, epc:0xffffffff80243fe6, tval:0xffffffff80244000, desc=guest_exec_page_fault
From the privileged spec:
If mtval is written with a nonzero value when an instruction access-fault or page-fault exception occurs on a system with variable-length instructions, then mtval will contain the virtual address of the portion of the instruction that caused the fault, while mepc will point to the beginning of the instruction.
Currently RISC-V only has 32-bit and 16-bit instructions, so the difference tval - epc should be either 0 or 2. In the examples above the differences are 906 and 26 respectively.
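As a quick sanity check of that arithmetic, here is a minimal Python sketch that recomputes the offsets from the two log lines above. The names `VALID_OFFSETS`, `samples`, `offsets` and `invalid` are only illustrative; the addresses are copied from the log.

```python
# Allowed tval - epc offsets when only 16-bit and 32-bit instructions exist.
VALID_OFFSETS = {0, 2}

# (epc, tval) pairs copied from the two log lines above.
samples = [
    (0xffffffff802fec76, 0xffffffff802ff000),
    (0xffffffff80243fe6, 0xffffffff80244000),
]

# Distance in bytes from the start of the instruction to the reported address.
offsets = [tval - epc for epc, tval in samples]

# Offsets that the spec wording quoted above does not allow.
invalid = [off for off in offsets if off not in VALID_OFFSETS]

print(offsets)  # [906, 26]
```

Both offsets end up in `invalid`, which is what makes these combinations suspicious.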
Possibly notable: all occurrences of these invalid combinations have tval aligned to a page boundary.
Steps to reproduce
This reproducer only gives invalid tval/epc combinations with instruction guest-page faults, but I've found it to be the easiest one to describe, since running KVM in RISC-V QEMU is presumably a standard setup. I have not been able to find a more minimal case.
- Start a QEMU-based riscv64 machine
- Start a KVM-based virtual machine with QEMU inside it
- Do some stuff in the KVM-based virtual machine to increase the chance of page faults
- Look in the debug log of the outer QEMU for guest_exec_page_fault exceptions with tval ending in 000, but epc ending in neither 000 nor ffe
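The last step can be automated. Below is a rough Python sketch that scans the `-d int` log for instruction fetch exceptions whose tval - epc is neither 0 nor 2. It assumes the log lines keep the exact `async:…, cause:…, epc:…, tval:…` layout shown in the description, with cause printed in hex; `find_suspicious` and the other names are hypothetical helpers, not part of QEMU. Entries with tval equal to zero are skipped, since the spec wording only constrains nonzero values.

```python
import re

# Exception codes from the privileged spec: instruction access fault (1),
# instruction page fault (12), instruction guest-page fault (20 = 0x14).
FETCH_FAULT_CAUSES = {1, 12, 20}

# Assumed layout, taken from the log lines quoted in the description.
# Only synchronous exceptions (async:0) are matched.
LINE_RE = re.compile(
    r"async:0, cause:([0-9a-fA-F]+), epc:0x([0-9a-fA-F]+), tval:0x([0-9a-fA-F]+)"
)


def find_suspicious(lines):
    """Return (epc, tval) pairs whose offset is neither 0 nor 2."""
    hits = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue
        cause, epc, tval = (int(g, 16) for g in m.groups())
        # Ignore other exceptions and the always-permitted tval == 0 case.
        if cause not in FETCH_FAULT_CAUSES or tval == 0:
            continue
        if tval - epc not in (0, 2):
            hits.append((epc, tval))
    return hits
```

With the command line above, something like `find_suspicious(open("log.txt"))` should list the offending pairs, if any.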
Everything in both layers of guests should otherwise work without issue, but other/future software that relies on the spec-mandated relationship of epc/tval may break.