x86_64: SYSRETQ does not raise #GP(0) when QEMU emulates Intel CPU and RCX contains a non-canonical address
## Host environment

- Operating system: Kali Linux 2025.4
- OS/kernel version: Linux kali 6.16.8+kali-amd64 #1 SMP PREEMPT_DYNAMIC Kali 6.16.8-1kali1 (2025-09-24) x86_64 GNU/Linux
- Architecture: x86_64
- QEMU flavor: qemu-system-x86_64
- QEMU version: 10.1.91 (v10.2.0-rc1-11-g5a5b06d2f6-dirty) (built from source)
- QEMU command line:
  ```
  ./qemu-system-x86_64 -cpu <any intel cpu> ./poc
  ```

## Emulated/Virtualized environment

- Operating system: Does not matter
- OS/kernel version: Does not matter
- Architecture: x86_64

## Description of problem

When QEMU emulates an Intel CPU, [`helper_sysret`](https://gitlab.com/qemu-project/qemu/-/blob/master/target/i386/tcg/seg_helper.c#L1085) does not check if RCX contains a non-canonical address when using 64-bit operand size. Because of that, no exception is raised when QEMU emulates Intel CPU and RCX contains a non-canonical address at the time `sysretq` is executed, although Intel SDM explicitly states that `#GP(0)` should be raised in this case.

## Steps to reproduce

1. Compile the following source code with `nasm -f bin ./poc.asm`:
   ```nasm
   bits 16
   org 0x7c00

   %define CR0_PE 1
   %define CR0_WP (1<<16)
   %define CR0_PG (1<<31)

   %define CR4_PAE (1<<5)

   %define MSR_EFER 0xc0000080
   %define EFER_SCE 1
   %define EFER_LME (1<<8)
   %define EFER_NXE (1<<11)

   %define MSR_STAR 0xc0000081

   %define KERNEL_CS 0x8
   %define USER_CS (0x18|3)

       cli
       xor ebx, ebx
       mov ds, ebx
       mov fs, ebx
       mov gs, ebx
       mov ss, ebx
       mov esp, 0x7c00
       jmp 0:reload_cs_16

   reload_cs_16:
       mov eax, 0x7e0
       mov es, eax
       xor dh, dh
       mov ecx, 2
       mov ah, 2
       mov al, (part2_end - part2 + 511) / 512
       int 0x13
       jc $

       cli
       mov esp, 0x7000

       mov eax, pml4
       mov cr3, eax

       mov ecx, MSR_EFER
       rdmsr
       or eax, EFER_LME | EFER_NXE | EFER_SCE
       wrmsr

       mov ecx, MSR_STAR
       mov edx, (USER_CS << 16) | KERNEL_CS
       xor eax, eax
       wrmsr

       mov eax, cr4
       or eax, CR4_PAE
       mov cr4, eax

       mov eax, cr0
       or eax, CR0_PG | CR0_WP | CR0_PE
       mov cr0, eax

       lgdt [gdtr]
       lidt [idtr]
       jmp KERNEL_CS:reload_cs_64

   bits 64
   reload_cs_64:
       xor eax, eax
       mov ds, eax
       mov es, eax
       mov fs, eax
       mov gs, eax
       mov eax, 0x10
       mov ss, eax

       mov eax, 0x30
       ltr eax

       mov ecx, 1
       shl rcx, 48
       pushfq
       pop r11
       ; nasm way to assemble sysretq
       o64 sysret

   %macro CREATE_EXCEPTION_HANDLER 2
   %1:
       push qword %2
   .halt_loop:
       cli
       hlt
       jmp .halt_loop
   %endmacro

   CREATE_EXCEPTION_HANDLER handle_double_fault, 8
   CREATE_EXCEPTION_HANDLER handle_general_protection_fault, 13
   CREATE_EXCEPTION_HANDLER handle_page_fault, 14

   times 510 - ($-$$) db 0
   dw 0xaa55

   part2:

   %define OFFSET_TO_ADDR(x) times ((x) - 0x7c00 - ($-$$)) db 0
   %define ADDROF(x) ((x) - $$ + 0x7c00)

   %define DATA_START_ADDR 0x8000
   %define PT_START_ADDR DATA_START_ADDR+0x1000

   OFFSET_TO_ADDR(DATA_START_ADDR)

   %define GDT_DATA_WRITABLE (1<<9)
   %define GDT_CODE_SEGMENT (1<<11)
   %define GDT_CODE_OR_DATA_SEGMENT (1<<12)
   %define GDT_DPL_3 (3<<13)
   %define GDT_PRESENT (1<<15)
   %define GDT_64BIT_CODE_SEGMENT (1<<21)
   %define GDT_32BIT_SEGMENT (1<<22)
   %define GDT_GRANULARITY (1<<23)
   %define GDT_TYPE_TSS_AVAILABLE ((1<<11)|(1<<8))

   gdt:
       dq 0
       ; 0x8 - kernel code (64-bit)
       dd 0xffff
       dd (0xf << 16) | GDT_PRESENT | GDT_GRANULARITY | GDT_CODE_OR_DATA_SEGMENT | GDT_CODE_SEGMENT | GDT_64BIT_CODE_SEGMENT
       ; 0x10 - kernel data (writable)
       dd 0xffff
       dd (0xf << 16) | GDT_PRESENT | GDT_GRANULARITY | GDT_CODE_OR_DATA_SEGMENT | GDT_32BIT_SEGMENT | GDT_DATA_WRITABLE
       ; 0x18 - not present
       dq 0
       ; 0x20 - user data (writable)
       dd 0xffff
       dd (0xf << 16) | GDT_PRESENT | GDT_GRANULARITY | GDT_CODE_OR_DATA_SEGMENT | GDT_32BIT_SEGMENT | GDT_DATA_WRITABLE | GDT_DPL_3
       ; 0x28 - user code (64-bit)
       dd 0xffff
       dd (0xf << 16) | GDT_PRESENT | GDT_GRANULARITY | GDT_CODE_OR_DATA_SEGMENT | GDT_CODE_SEGMENT | GDT_64BIT_CODE_SEGMENT | GDT_DPL_3
       ; 0x30 - TSS
       dw tss.end - tss
       dw tss
       dd (ADDROF(tss) >> 16) | GDT_PRESENT | GDT_TYPE_TSS_AVAILABLE | (ADDROF(tss) & 0xff000000)
       dd ADDROF(tss) >> 32
       dd 0

   gdtr:
   .limit: dw gdtr - gdt - 1
   .base: dq gdt

   %define IDT_PRESENT (1<<15)
   %define IDT_TRAP_GATE ((1<<11)|(1<<10)|(1<<9)|(1<<8))

   %macro MAKE_IDT_ENTRY 1
       dw ADDROF(%1) & 0xffff
       dw KERNEL_CS
       dw IDT_PRESENT | IDT_TRAP_GATE
       dq ADDROF(%1) >> 16
       dw 0
   %endmacro

   idt:
       times 2*(7-0+1) dq 0 ; entries 0 - 7: not present
       MAKE_IDT_ENTRY handle_double_fault
       times 2*(12-9+1) dq 0 ; entries 9 - 12: not present
       MAKE_IDT_ENTRY handle_general_protection_fault
       MAKE_IDT_ENTRY handle_page_fault

   idtr:
   .limit: dw idtr - idt - 1
   .base: dq idt

   tss:
   .res0: dd 0
   .rsp0: dq 0x7000
   .rsp1: dq 0
   .rsp2: dq 0
   .res1: dq 0
   .ist1: dq 0
   .ist2: dq 0
   .ist3: dq 0
   .ist4: dq 0
   .ist5: dq 0
   .ist6: dq 0
   .ist7: dq 0
   .res2: dq 0
          dw 0
   .io_map_base: dw tss.end - tss
   .end:

   OFFSET_TO_ADDR(PT_START_ADDR)

   %define PAGE_PRESENT 1
   %define PAGE_WRITABLE (1<<1)
   %define PAGE_USER (1<<2)
   %define PAGE_NOEXECUTE (1<<63)

   pml4:
       dq ADDROF(pdpt) | PAGE_PRESENT | PAGE_WRITABLE | PAGE_USER
   OFFSET_TO_ADDR(PT_START_ADDR+0x1000)
   pdpt:
       dq ADDROF(pd) | PAGE_PRESENT | PAGE_WRITABLE | PAGE_USER
   OFFSET_TO_ADDR(PT_START_ADDR+0x2000)
   pd:
       dq ADDROF(pt) | PAGE_PRESENT | PAGE_WRITABLE | PAGE_USER
   OFFSET_TO_ADDR(PT_START_ADDR+0x3000)
   pt:
       times 6 dq 0
       dq 0x6000 | PAGE_PRESENT | PAGE_WRITABLE | PAGE_NOEXECUTE
       dq 0x7000 | PAGE_PRESENT
       dq 0x8000 | PAGE_PRESENT | PAGE_WRITABLE | PAGE_NOEXECUTE
   OFFSET_TO_ADDR(PT_START_ADDR+0x4000)

   part2_end:
   ```
2. Start qemu-system-x86_64:
   ```
   qemu-system-x86_64 -cpu Skylake-Client-v4 -d int ./poc # any Intel CPU can be provided to -cpu
   ```
3. Confirm that `#GP(0)` is raised at CPL=3 with RIP containing non-canonical address (which means `sysretq` was successfully executed):
   ```
   check_exception old: 0xffffffff new 0xd
   0: v=0d e=0000 i=0 cpl=3 IP=002b:0001000000000000 pc=0001000000000000 SP=0023:0000000000007000 env->regs[R_EAX]=0000000000000030
   RAX=0000000000000030 RBX=0000000000000000 RCX=0001000000000000 RDX=00000000001b0008
   RSI=0000000000000000 RDI=0000000000000000 RBP=0000000000000000 RSP=0000000000007000
   R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 R11=0000000000000006
   R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
   RIP=0001000000000000 RFL=00000006 [-----P-] CPL=3 II=0 A20=1 SMM=0 HLT=0
   ES =0000 0000000000000000 00000000 00000000
   CS =002b 0000000000000000 ffffffff 00a0fb00 DPL=3 CS64 [-RA]
   SS =0023 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
   DS =0000 0000000000000000 00000000 00000000
   FS =0000 0000000000000000 00000000 00000000
   GS =0000 0000000000000000 00000000 00000000
   LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
   TR =0030 0000000000008144 00000068 00008900 DPL=0 TSS64-avl
   GDT= 0000000000008000 0000003f
   IDT= 000000000000804a 000000ef
   CR0=80010011 CR2=0000000000000000 CR3=0000000000009000 CR4=00000020
   DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
   DR6=00000000ffff0ff0 DR7=0000000000000400
   CCS=0000000000000004 CCD=0001000000000000 CCO=EFLAGS
   EFER=0000000000000d01
   ```

### Expected behavior (Can be reproduced only on a host that runs on Intel CPU)

If your computer runs on Intel CPU, you can observe how `sysretq` should behave in this case as follows:

1. Compile the source code provided above
2. Start qemu-system-x86_64:
   ```
   qemu-system-x86_64 ./poc -cpu host -enable-kvm -monitor stdio
   ```
3. Ensure that CPU is in a halted state (which means that the exception handler was executed, see the code of `CREATE_EXCEPTION_HANDLER` macro) and RCX contains a non-canonical address:
   ```
   (qemu) info registers

   CPU#0
   RAX=0000000000000030 RBX=0000000000000000 RCX=0001000000000000 RDX=00000000001b0008
   RSI=0000000000000000 RDI=0000000000000000 RBP=0000000000000000 RSP=0000000000006fc8
   R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 R11=0000000000000006
   R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
   RIP=0000000000007cb7 RFL=00000006 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=1
   ES =0000 0000000000000000 ffffffff 00c00100
   CS =0008 0000000000000000 ffffffff 00a09900 DPL=0 CS64 [--A]
   SS =0010 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
   DS =0000 0000000000000000 ffffffff 00c00100
   FS =0000 0000000000000000 ffffffff 00c00100
   GS =0000 0000000000000000 ffffffff 00c00100
   LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
   TR =0030 0000000000008144 00000068 00008b00 DPL=0 TSS64-busy
   GDT= 0000000000008000 0000003f
   IDT= 000000000000804a 000000ef
   CR0=80010011 CR2=0000000000000000 CR3=0000000000009000 CR4=00000020
   DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
   DR6=00000000ffff0ff0 DR7=0000000000000400
   EFER=0000000000000d01
   <FPU and other unrelated registers are omitted for brevity>
   ```
4. View the contents of the stack:
   ```
   (qemu) x/7xg $esp
   0000000000006fc8: 0x000000000000000d 0x0000000000000000
   0000000000006fd8: 0x0000000000007caa 0x0000000000000008
   0000000000006fe8: 0x0000000000010006 0x0000000000007000
   0000000000006ff8: 0x0000000000000010
   (qemu)
   ```
   - 0x6fc8 contains the exception number that was pushed by the handler
   - 0x6fd0 contains the error code
   - 0x6fd8 contains the RIP of the faulting instruction
   - 0x6fe0 contains the kernel code segment
   - 0x6fe8 contains the RFLAGS (not relevant)
   - 0x6ff0 contains the RSP at the time of the exception (not relevant)
   - 0x6ff8 contains the kernel stack segment (not relevant)
5. Disassemble the faulting instruction:
   ```
   (qemu) x/i 0x7caa
   0x00007caa: 48 0f 07 sysretq
   (qemu)
   ```
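
For illustration, the check that the description says is missing can be sketched in C as follows. This is only a sketch under stated assumptions, not QEMU's code or a proposed patch: `is_canonical` and `sysret_must_raise_gp` are hypothetical names, the `vendor_is_intel` flag stands in for however the emulator selects Intel behaviour, and `va_bits` is assumed to be 48 because the PoC does not enable LA57 (CR4=0x20).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): an address is canonical when
 * bits 63..(va_bits - 1) are all zeros or all ones. va_bits is 48 with
 * 4-level paging and 57 with LA57; callers must pass a value in 1..64.
 */
static bool is_canonical(uint64_t addr, unsigned va_bits)
{
    uint64_t upper = addr >> (va_bits - 1);
    return upper == 0 || upper == (UINT64_MAX >> (va_bits - 1));
}

/*
 * Hypothetical predicate: must SYSRET raise #GP(0) while still at CPL 0?
 * Only the 64-bit operand size form (SYSRETQ) loads RIP from the full RCX,
 * so only that form is checked. vendor_is_intel is an assumed flag.
 */
static bool sysret_must_raise_gp(bool vendor_is_intel, bool operand_size_64,
                                 uint64_t rcx, unsigned va_bits)
{
    return vendor_is_intel && operand_size_64 && !is_canonical(rcx, va_bits);
}
```

With the value the PoC loads into RCX (`1 << 48`) and 48-bit virtual addresses, `sysret_must_raise_gp(true, true, 1ULL << 48, 48)` evaluates to true, which corresponds to the `#GP(0)` taken at CPL=0 in the KVM run above.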