
QEMU tests broken on macOS due to bad file descriptors

Software environment

  • Operating system: macOS 11
  • Architecture: x86_64
  • libvirt version: HEAD

Description of problem

Despite no change in libvirt code, macOS 11 CI jobs suddenly started failing the QEMU tests.

The last working job was:

https://gitlab.com/libvirt/libvirt/-/jobs/2361318722

The first broken job was:

https://gitlab.com/libvirt/libvirt/-/jobs/2364132896

The git commit HEAD is the same in both jobs. The logs show, however, that glib was updated from 2.72.0 to 2.72.1.

Running a job with --print-errorlogs:

https://gitlab.com/berrange/libvirt/-/jobs/2376891003

we see that all the failing tests print:

(process:50961): GLib-WARNING **: 01:56:14.162: poll(2) failed due to: Bad file descriptor.

Looking in the glib2 git log, I see:

https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2571

commit 2215c88d76f944d5ed38a26fbe79a20e5e16bfe6
Merge: 01a1caa93 be5acbb5e
Author: Philip Withnall <philip@tecnocode.co.uk>
Date:   Mon Mar 28 11:59:59 2022 +0000

    Merge branch 'macos-broken-poll' into 'main'
    
    meson: Set BROKEN_POLL in macOS builds
    
    See merge request GNOME/glib!2571

IOW, glib switched from using macOS native poll to using select to emulate poll.

A key difference between these implementations is the handling of bad file descriptors. A real poll implementation succeeds and reports POLLNVAL in the returned events for the bad descriptor, whereas a select-based emulation fails with errno set to EBADF.
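The difference can be reproduced outside of glib with a small C sketch. This is not glib's or libvirt's code: the helper names (make_closed_fd, poll_closed_fd, select_closed_fd) are made up for illustration, and it only exercises the plain POSIX poll() and select() calls on a descriptor that has already been closed.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: return a descriptor number that is known to be closed. */
static int make_closed_fd(void)
{
    int fds[2];
    int rc = pipe(fds);

    assert(rc == 0);
    close(fds[0]);
    close(fds[1]);
    return fds[0];
}

/* Poll a single descriptor without blocking.
 * Returns the revents bitmask, or -1 if poll() itself failed. */
static int poll_closed_fd(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    if (poll(&pfd, 1, 0) < 0)
        return -1;
    return pfd.revents;
}

/* Wait on a single descriptor with select(), without blocking.
 * Returns 0 if select() succeeded, otherwise the errno it set. */
static int select_closed_fd(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) < 0)
        return errno;
    return 0;
}
```

Calling poll_closed_fd(make_closed_fd()) is expected to yield a mask with POLLNVAL set while the call as a whole succeeds, whereas select_closed_fd() on the same descriptor is expected to return EBADF. Any poll emulation built on top of select inherits the second behaviour.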

I suspect this is a genuine latent bug in libvirt. We're likely closing a file descriptor prematurely. Previously this harmlessly reported POLLNVAL, which we probably ignore, but now it causes the fatal EBADF.
