QEMU tests broken on macOS due to bad file descriptors

Software environment

  • Operating system: macOS 11
  • Architecture: x86_64
  • libvirt version: HEAD

Description of problem

Despite no change in libvirt code, macOS 11 CI jobs suddenly started failing the QEMU tests.

The last working job was:

https://gitlab.com/libvirt/libvirt/-/jobs/2361318722

The first broken job was:

https://gitlab.com/libvirt/libvirt/-/jobs/2364132896

The git commit HEAD is the same in both jobs. The logs show, however, that glib was updated from 2.72.0 to 2.72.1.

Running a job with --print-errorlogs:

https://gitlab.com/berrange/libvirt/-/jobs/2376891003

we see that all of the failing tests print:

(process:50961): GLib-WARNING **: 01:56:14.162: poll(2) failed due to: Bad file descriptor.

Looking at the glib2 git log, I see:

https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2571

commit 2215c88d76f944d5ed38a26fbe79a20e5e16bfe6
Merge: 01a1caa93 be5acbb5e
Author: Philip Withnall <philip@tecnocode.co.uk>
Date:   Mon Mar 28 11:59:59 2022 +0000

    Merge branch 'macos-broken-poll' into 'main'
    
    meson: Set BROKEN_POLL in macOS builds
    
    See merge request GNOME/glib!2571

IOW, glib switched from using macOS native poll to using select to emulate poll.

A key difference between these implementations is how they handle bad file descriptors. A real poll() implementation succeeds and reports POLLNVAL in revents for the bad descriptor, whereas a select()-based emulation fails outright with errno set to EBADF.
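
To illustrate the difference, here is a minimal standalone sketch (not libvirt or glib code) that hands a deliberately closed descriptor to both calls. The helper names make_closed_fd, poll_reports_pollnval and select_fails_ebadf are made up for this example, and it assumes a single-threaded process so the closed descriptor number is not reused in between.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: return a descriptor number that is known to be
 * closed, by opening a pipe and closing both ends again. */
static int make_closed_fd(void)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;
    close(fds[0]);
    close(fds[1]);
    return fds[0];
}

/* Returns 1 if poll() itself succeeded and flagged the descriptor
 * with POLLNVAL in revents, 0 otherwise. */
static int poll_reports_pollnval(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, 0);

    return rc == 1 && (pfd.revents & POLLNVAL) != 0;
}

/* Returns 1 if select() failed as a whole with errno == EBADF,
 * 0 otherwise. */
static int select_fails_ebadf(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };
    int rc;

    /* FD_SET is undefined for descriptors outside [0, FD_SETSIZE). */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    errno = 0;
    rc = select(fd + 1, &rfds, NULL, NULL, &tv);

    return rc == -1 && errno == EBADF;
}
```

Calling both helpers on the value returned by make_closed_fd() should show poll() succeeding with POLLNVAL while select() fails with EBADF. A poll() emulation built on select() has no per-descriptor result to fall back on, so presumably that error is what surfaces as the GLib warning above.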

I suspect this is a genuine latent bug in libvirt. We're likely closing a file descriptor prematurely. Previously this harmlessly reported POLLNVAL, which we probably ignore, but now it causes a fatal EBADF.