linux-user hangs if fd_trans_lock is held during fork
We received user reports of a hang inside a cross-architecture Docker container (using qemu-user-static) when running Python bytecode compilation in uv.
I believe I've minimized it to the test case below, which hangs for me with high probability on same-architecture (aarch64-on-aarch64) qemu-user. The basic problem seems to be that the fd_trans code has a pthread lock (introduced in c093364f), but qemu-user performs a fork-without-exec when the emulated process calls fork(), and if you fork while another thread holds a pthread lock, the child's copy of that lock remains permanently locked. The new process therefore hangs as soon as it enters the fd_trans code.
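To see the underlying POSIX hazard in isolation, here is a minimal host-side sketch of my own (no QEMU involved; compile with cc -pthread): after fork() in a multithreaded process, only the calling thread exists in the child, so the child's copy of a mutex that some other thread held at fork time can never be unlocked.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *holder(void *arg)
{
    pthread_mutex_lock(&lock);      /* hold the lock across the fork */
    sleep(1);
    pthread_mutex_unlock(&lock);    /* unlocks the parent's copy only */
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, holder, NULL);
    usleep(100000);                 /* give holder time to take the lock */
    if (fork() == 0) {
        /* The holder thread does not exist in the child, so the
         * child's copy of the mutex is stuck locked: this blocks
         * forever while the parent exits normally. */
        pthread_mutex_lock(&lock);
        fprintf(stderr, "unreachable\n");
        _exit(0);
    }
    pthread_join(t, NULL);
    return 0;
}

QEMU hits exactly this shape: the guest's fork() becomes a host fork() while some other guest thread may be inside the fd_trans code holding target_fd_trans_lock.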
Here's the test program. The gist is to spawn some threads that repeatedly do close(dup(2)), and at the same time fork children that also do close(dup(2)), until a child happens to be forked while another thread in the parent holds the fd_trans_lock. At that point the child deadlocks as soon as it tries to create, close, or change file descriptors.
#include <sys/wait.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *mythread(void *arg)
{
    /* Hammer the fd_trans code path from several threads. */
    for (int i = 0; i < 500000; i++) {
        int fd = dup(2);
        close(fd);
    }
    return NULL;
}

int main(void)
{
    pthread_t thread[10];
    for (int i = 0; i < 10; i++)
        pthread_create(&thread[i], NULL, mythread, NULL);
    usleep(30000);
    printf("Starting children\n");
    for (int i = 0; i < 10000; i++) {
        int pid = fork();
        if (pid == 0) {
            /* Any fd operation here deadlocks if the fork happened
             * while a parent thread held the fd_trans lock. */
            fprintf(stderr, "child %d {", i);
            int fd = dup(2);
            close(fd);
            fprintf(stderr, "} %d\n", i);
            return 0;
        } else {
            if (waitpid(pid, NULL, 0) != pid)
                perror("waitpid");
        }
    }
    printf("Children done\n");
    for (int i = 0; i < 10; i++)
        pthread_join(thread[i], NULL);
    printf("All done\n");
}
Compile with cc -pthread -o race race.c and run with qemu-$(uname -m) ./race. If it doesn't hang after a few tries, try tweaking the constants (the goal is to have the threads running while you fork a child). Here's the backtrace I get:
(gdb) t a a bt
Thread 2 (Thread 0xffff8c63eea0 (LWP 47948) "qemu-aarch64"):
#0 0x0000ffff8c7d6a24 in syscall () from /nix/store/q6kvdfhlp251n8rx183lg1n245kwg6m8-glibc-2.39-52/lib/libc.so.6
#1 0x0000aaaab82b0090 in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at /qemu-9.2.2/include/qemu/futex.h:29
#2 qemu_event_wait (ev=ev@entry=0xaaaab848bc90 <rcu_call_ready_event>) at ../util/qemu-thread-posix.c:464
#3 0x0000aaaab82b911c in call_rcu_thread (opaque=opaque@entry=0x0) at ../util/rcu.c:278
#4 0x0000aaaab82aeacc in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:541
#5 0x0000ffff8c77566c in start_thread () from /nix/store/q6kvdfhlp251n8rx183lg1n245kwg6m8-glibc-2.39-52/lib/libc.so.6
#6 0x0000ffff8c7d8c8c in thread_start () from /nix/store/q6kvdfhlp251n8rx183lg1n245kwg6m8-glibc-2.39-52/lib/libc.so.6
Thread 1 (Thread 0xffff8cb56020 (LWP 47947) "qemu-aarch64"):
#0 0x0000ffff8c77210c in __lll_lock_wait () from /nix/store/q6kvdfhlp251n8rx183lg1n245kwg6m8-glibc-2.39-52/lib/libc.so.6
#1 0x0000ffff8c7787fc in pthread_mutex_lock@@GLIBC_2.17 () from /nix/store/q6kvdfhlp251n8rx183lg1n245kwg6m8-glibc-2.39-52/lib/libc.so.6
#2 0x0000aaaab82aef30 in qemu_mutex_lock_impl (mutex=0xaaaab8480c78 <target_fd_trans_lock>, file=0xaaaab82f6c70 "/qemu-9.2.2/include/qemu/lockable.h", line=56) at ../util/qemu-thread-posix.c:94
#3 0x0000aaaab82499f8 in qemu_lockable_mutex_lock (x=<optimized out>) at /qemu-9.2.2/include/qemu/lockable.h:56
#4 qemu_lockable_lock (x=<optimized out>) at /qemu-9.2.2/include/qemu/lockable.h:110
#5 qemu_lockable_auto_lock (x=<optimized out>) at /qemu-9.2.2/include/qemu/lockable.h:120
#6 fd_trans_target_to_host_data (fd=2) at ../linux-user/fd-trans.h:45
#7 do_syscall1 (cpu_env=cpu_env@entry=0xaaaae86ea9d0, num=num@entry=64, arg1=arg1@entry=2, arg2=arg2@entry=281473028106568, arg3=arg3@entry=9, arg4=arg4@entry=0, arg5=arg5@entry=4294967295, arg6=arg6@entry=4222429319, arg8=0, arg7=0) at ../linux-user/syscall.c:9271
#8 0x0000aaaab824c870 in do_syscall (cpu_env=cpu_env@entry=0xaaaae86ea9d0, num=64, arg1=2, arg2=281473028106568, arg3=9, arg4=0, arg5=4294967295, arg6=4222429319, arg7=arg7@entry=0, arg8=arg8@entry=0) at ../linux-user/syscall.c:13884
#9 0x0000aaaab8005f30 in cpu_loop (env=env@entry=0xaaaae86ea9d0) at ../linux-user/aarch64/cpu_loop.c:95
#10 0x0000aaaab8000c98 in main (argc=2, argv=0xffffd11e2438, envp=<optimized out>) at ../linux-user/main.c:1037
This was reported with qemu-user-static 8.2.2 from Ubuntu 24.04. I reproduced it there, then minimized and retested with 9.2.2 built from source, which is where the backtrace above comes from. I don't see any relevant code changes in HEAD that would affect this, but I'm happy to retest with HEAD if you'd like.
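For what it's worth, the conventional way to make a single lock fork-safe is pthread_atfork(): acquire it in the prepare handler and release it in both the parent and child handlers, so the child never inherits it locked. Below is a minimal sketch of that shape, using plain pthreads rather than QEMU's QemuMutex wrappers; the handler names are mine, and linux-user's existing fork_start()/fork_end() hooks may be the more natural place for this.

#include <pthread.h>

static pthread_mutex_t target_fd_trans_lock = PTHREAD_MUTEX_INITIALIZER;

/* Prepare handler: runs in the forking thread before fork(). Once it
 * holds the lock, no other thread can be inside the fd_trans code. */
static void fd_trans_prefork(void)
{
    pthread_mutex_lock(&target_fd_trans_lock);
}

/* Parent/child handler: runs in both processes after fork(), so
 * neither copy of the lock is left stuck in the locked state. */
static void fd_trans_postfork(void)
{
    pthread_mutex_unlock(&target_fd_trans_lock);
}

void fd_trans_register_atfork(void)
{
    pthread_atfork(fd_trans_prefork, fd_trans_postfork, fd_trans_postfork);
}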
Host environment
- Operating system: reported on Ubuntu 22.04 and 24.04 x86_64, reproduced on NixOS 24.05 aarch64
- OS/kernel version: Linux nixos 6.6.48 #1-NixOS SMP Thu Aug 29 15:33:59 UTC 2024 aarch64 GNU/Linux
- Architecture: x86-64 or aarch64
- QEMU flavor: qemu-aarch64 (though I believe it would happen with other user targets)
- QEMU version: reported on qemu-aarch64 version 8.2.2 (Debian 1:8.2.2+ds-0ubuntu1.5), reproduced on qemu-aarch64 version 9.2.2
- QEMU command line: qemu-aarch64 ./race
Emulated/Virtualized environment
- Operating system: qemu-user
- OS/kernel version: qemu-user
- Architecture: aarch64