The read syscall returns a spurious EFAULT when code is being executed from the same page

Host environment

  • Operating system: Fedora Rawhide
  • OS/kernel version: 6.16.0-0.rc0.250605gec7714e494790.13.fc43.x86_64
  • Architecture: x86_64
  • QEMU flavor: qemu-aarch64
  • QEMU version: qemu-10.0.2-1.fc43
  • QEMU command line: qemu-aarch64-static ./a.out

Emulated/Virtualized environment

  • Operating system: Linux
  • OS/kernel version: userspace emulation
  • Architecture: aarch64

I observed spurious EFAULT errors when running aarch64 binaries on an x86-64 host. I analyzed them and it turned out that the read() syscall returns EFAULT when one thread reads from a file handle into a page while another thread simultaneously executes machine code located in another part of the same page.

Steps to reproduce

  1. compile the attached program with "aarch64-linux-gnu-gcc qemu-bug.c"
  2. run the program with "qemu-aarch64 ./a.out"
  3. the program prints "pid 2483466, mapped at 0x5502a4b000", then "read returned -1, error Bad address", and exits
  4. if you run the program on a real arm64 machine, read() never fails and the program keeps running

The problem is that when executing the read() syscall, QEMU checks whether the target buffer is read-only. If it is, QEMU invalidates the translated code and switches the buffer to read-write. However, if another thread executes some code from the same page at this point, the page is switched back to read-only before the host read() runs, and read() returns EFAULT.
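Until this is fixed, a guest program could in principle avoid handing the executable page to read() at all. The following is only a sketch of that idea, not a verified workaround: read_via_bounce is a hypothetical helper (it is not part of QEMU or libc), the 4096-byte bounce buffer size is an arbitrary choice, and it assumes that an ordinary guest store into the write-protected page is handled correctly by QEMU, which has not been tested here.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: read into a stack buffer that never holds code,
 * then copy into the real destination with ordinary stores, so the
 * syscall never sees the page that is also being executed.
 * Reads at most 4096 bytes per call (arbitrary limit for this sketch).
 */
static ssize_t read_via_bounce(int fd, void *dst, size_t len)
{
        char bounce[4096];
        ssize_t r;

        assert(dst != NULL);
        if (len > sizeof(bounce))
                len = sizeof(bounce);

        /* Retry only on EINTR; every other error is passed to the caller. */
        do {
                r = read(fd, bounce, len);
        } while (r < 0 && errno == EINTR);

        /* Copy only the bytes that were actually read. */
        if (r > 0)
                memcpy(dst, bounce, (size_t)r);
        return r;
}
```

In the reproducer, the reader thread would call read_via_bounce(h, ptr, 1) instead of read(h, ptr, 1). This costs an extra copy and does not fix the underlying race in QEMU.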

Additional information

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <pthread.h>
#include <sys/mman.h>

static char *ptr;
static int h;

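/* Repeatedly writes and executes a single instruction in the last 4 bytes of the page. */
static void *exec(void *x)
{
        while (1) {
                /* 0xd65f03c0 is the AArch64 "ret" instruction */
                *(unsigned *)&ptr[4092] = 0xd65f03c0;
                /* make the newly written instruction visible to instruction fetch */
                __builtin___clear_cache(ptr + 4092, ptr + 4096);
                /* call it; it returns immediately */
                ((void (*)(void))&ptr[4092])();
        }
}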

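/* Repeatedly reads one byte from /dev/zero into the start of the same page. */
static void *reader(void *x)
{
        while (1) {
                int r = read(h, ptr, 1);
                /* reading 1 byte from /dev/zero into a writable mapping should always succeed */
                if (r != 1) {
                        printf("read returned %d, error %s\n", r, strerror(errno));
                        exit(1);
                }
        }
}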

int main(void)
{
        int r;
        pthread_t t1, t2;
        ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (ptr == MAP_FAILED) perror("mmap"), exit(1);
        printf("pid %d, mapped at %p\n", getpid(), ptr);
        h = open("/dev/zero", O_RDONLY);
        if (h < 0) perror("open"), exit(1);
        r = pthread_create(&t1, NULL, exec, NULL);
        if (r) fprintf(stderr, "pthread_create failed\n"), exit(1);
        r = pthread_create(&t2, NULL, reader, NULL);
        if (r) fprintf(stderr, "pthread_create failed\n"), exit(1);
        pause();
        return 0;
}
Edited by Mikuláš Patočka