Test fails with SIGSEGV because of use-after-free

Host environment

  • Operating system: linux
  • Architecture: x86
  • QEMU flavor: qemu-system-x86_64
  • QEMU version: 8.1.50 (commit d762bf97)

Description

migration-test sometimes fails with SIGSEGV. The failure can be reproduced more reliably by repeating the test:

meson test -C build qtest-x86_64/migration-test --test-args='-p /x86_64/migration/validate_uuid' --repeat 50 --print-errorlogs

This failure is caused by a use-after-free bug. When QEMU is terminated by SIGTERM during the test, it executes the cleanup function migration_shutdown:

void migration_shutdown(void)
{
    /*
     * When the QEMU main thread exit, the COLO thread
     * may wait a semaphore. So, we should wakeup the
     * COLO thread before migration shutdown.
     */
    colo_shutdown();
    /*
     * Cancel the current migration - that will (eventually)
     * stop the migration using this structure
     */
    migration_cancel(NULL);
    object_unref(OBJECT(current_migration)); // <----- BUG

    /*
     * Cancel outgoing migration of dirty bitmaps. It should
     * at least unref used block nodes.
     */
    dirty_bitmap_mig_cancel_outgoing();

    /*
     * Cancel incoming migration of dirty bitmaps. Dirty bitmaps
     * are non-critical data, and their loss never considered as
     * something serious.
     */
    dirty_bitmap_mig_cancel_incoming();
}

The problem is that migration_shutdown calls object_unref on current_migration, but the object remains reachable through migrate_get_current, which can then return a pointer to a deallocated object. Any coroutine that has not finished by this point may still use that object, which is what happens in the failing test.
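
To make the pattern easier to see in isolation, here is a minimal sketch of the same lifetime problem outside QEMU. Everything in it is hypothetical: ToyState, toy_current, toy_get_current and toy_shutdown are stand-ins for MigrationState, current_migration, migrate_get_current and the unref in migration_shutdown, and are not QEMU code. To keep the sketch free of undefined behavior, the object is never actually freed; a finalized flag marks the point where a real object would be deallocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object; not QEMU's MigrationState. */
typedef struct ToyState {
    int refcount;
    bool finalized;   /* set at the point where a real object would be freed */
} ToyState;

static ToyState toy_storage = { .refcount = 1, .finalized = false };

/* Plays the role of the current_migration global. */
static ToyState *toy_current = &toy_storage;

/* Take an extra reference; only valid while the object is still alive. */
static void toy_ref(ToyState *s)
{
    assert(!s->finalized);
    s->refcount++;
}

/* Drop a reference; the last unref "destroys" the object. */
static void toy_unref(ToyState *s)
{
    assert(s->refcount > 0);
    if (--s->refcount == 0) {
        s->finalized = true;   /* real code would free the object here */
    }
}

/* Plays the role of migrate_get_current: returns the global unconditionally. */
static ToyState *toy_get_current(void)
{
    return toy_current;
}

/* Plays the role of the unref in migration_shutdown: the global is not cleared. */
static void toy_shutdown(void)
{
    toy_unref(toy_current);
}

/* What a late user (for example an unfinished coroutine) would observe. */
static bool toy_late_user_sees_dead_object(void)
{
    return toy_get_current()->finalized;
}
```

In this model, calling toy_shutdown() while only the initial reference exists leaves toy_get_current() returning an object that has already been finalized. If a user calls toy_ref() before shutdown and toy_unref() only when it is done, the object stays alive for that user. This is one possible way to avoid the problem, shown only as an illustration; it is not necessarily the fix that should be applied in QEMU.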