QEMU tests broken on macOS due to bad file descriptors
Software environment
- Operating system: macOS 11
- Architecture: x86_64
- libvirt version: HEAD
Description of problem
Despite no change in libvirt code, macOS 11 CI jobs suddenly started failing the QEMU tests.
The last working job was:
https://gitlab.com/libvirt/libvirt/-/jobs/2361318722
The first broken job was:
https://gitlab.com/libvirt/libvirt/-/jobs/2364132896
The git commit HEAD is the same in both jobs. The logs show, however, that glib was updated from 2.72.0 to 2.72.1.
Running a job with --print-errorlogs:
https://gitlab.com/berrange/libvirt/-/jobs/2376891003
we see that all the failing tests print:
(process:50961): GLib-WARNING **: 01:56:14.162: poll(2) failed due to: Bad file descriptor.
Looking in the glib2 git log, I see:
https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2571
commit 2215c88d76f944d5ed38a26fbe79a20e5e16bfe6
Merge: 01a1caa93 be5acbb5e
Author: Philip Withnall <philip@tecnocode.co.uk>
Date: Mon Mar 28 11:59:59 2022 +0000
Merge branch 'macos-broken-poll' into 'main'
meson: Set BROKEN_POLL in macOS builds
See merge request GNOME/glib!2571
IOW, glib switched from using the macOS native poll to using select to emulate poll.
A key difference between these implementations is their handling of bad file descriptors. A real poll implementation succeeds and sets POLLNVAL in the returned events for the bad descriptor, whereas a select-based emulation fails with errno set to EBADF.
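The difference can be reproduced outside libvirt and glib. The following is a minimal sketch, not taken from either code base; the helper names make_closed_fd, poll_revents and select_errno are made up for illustration. It assumes a platform where the native poll reports POLLNVAL for a closed descriptor, as described above.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: return a descriptor number that was valid
 * a moment ago but has since been closed. Returns -1 if pipe() fails. */
int make_closed_fd(void)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;
    close(fds[0]);
    close(fds[1]);
    return fds[0];
}

/* Poll a single descriptor without blocking.
 * Returns the revents mask, or -1 if poll() itself failed. */
int poll_revents(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    if (poll(&pfd, 1, 0) < 0)
        return -1;
    return pfd.revents;
}

/* Watch a single descriptor for readability with select(), without blocking.
 * Returns 0 if select() succeeded, otherwise the errno it failed with. */
int select_errno(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) < 0)
        return errno;
    return 0;
}
```

Calling poll_revents(make_closed_fd()) should yield a mask with POLLNVAL set, so the call as a whole succeeds, while select_errno() on the same descriptor should return EBADF, so the whole call fails. An event loop that ignored POLLNVAL would therefore only start to fail once poll is emulated via select.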
I suspect this is a genuine latent bug in libvirt. We're likely closing a file descriptor prematurely. Previously this harmlessly reported POLLNVAL, which we're probably ignoring, but now it causes the fatal EBADF.
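One way to narrow down which descriptor is stale is to probe each descriptor just before it is handed to the event loop. The sketch below is only a debugging aid under that assumption. The function fd_is_open is a hypothetical name, not an existing libvirt or glib API, and it relies on fcntl(F_GETFD) failing with EBADF for a closed descriptor.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical debugging helper: returns 1 if fd currently refers to an
 * open descriptor, 0 if the kernel reports it as bad (EBADF). */
int fd_is_open(int fd)
{
    if (fcntl(fd, F_GETFD) != -1)
        return 1;
    return errno != EBADF;
}
```

Logging a message whenever fd_is_open() returns 0 for a descriptor about to be watched would point at the code path that closed it too early.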