
Test fails with SIGSEGV because of use-after-free

Host environment

  • Operating system: linux
  • Architecture: x86
  • QEMU flavor: qemu-system-x86_64
  • QEMU version: 8.1.50 (commit d762bf97)

Description

migration-test sometimes fails with SIGSEGV. This can be more reliably reproduced by running:

meson test -C build qtest-x86_64/migration-test --test-args='-p /x86_64/migration/validate_uuid' --repeat 50 --print-errorlogs

This failure is caused by a use-after-free bug. When QEMU is terminated by SIGTERM during the test, it executes the cleanup function migration_shutdown:

void migration_shutdown(void)
{
    /*
     * When the QEMU main thread exit, the COLO thread
     * may wait a semaphore. So, we should wakeup the
     * COLO thread before migration shutdown.
     */
    colo_shutdown();
    /*
     * Cancel the current migration - that will (eventually)
     * stop the migration using this structure
     */
    migration_cancel(NULL);
    object_unref(OBJECT(current_migration)); // <----- BUG

    /*
     * Cancel outgoing migration of dirty bitmaps. It should
     * at least unref used block nodes.
     */
    dirty_bitmap_mig_cancel_outgoing();

    /*
     * Cancel incoming migration of dirty bitmaps. Dirty bitmaps
     * are non-critical data, and their loss never considered as
     * something serious.
     */
    dirty_bitmap_mig_cancel_incoming();
}

The problem is that migration_shutdown calls object_unref on current_migration, but the object remains reachable through migrate_get_current, which can then return a pointer to freed memory. Coroutines that have not finished by this point may still use the object, which is what happens in the failing test.
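To make the lifetime issue concrete, here is a minimal standalone sketch of the pattern. It does not use QEMU's real types or API: Obj, obj_ref, obj_unref, get_current and shutdown_like_report are hypothetical stand-ins for the QOM object, object_ref, object_unref, migrate_get_current and migration_shutdown. The second scenario shows one possible mitigation, in which pending work takes its own reference before shutdown; this is an illustration of the idea, not necessarily the fix that was or should be applied in QEMU.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object (not QEMU's types). */
typedef struct Obj {
    int refcount;
    int state;
} Obj;

static Obj *current;      /* models the global current_migration */
static int live_objects;  /* number of allocated, not yet freed objects */

static void obj_create_current(void)
{
    current = calloc(1, sizeof(*current));
    assert(current);
    current->refcount = 1;
    current->state = 42;
    live_objects++;
}

static void obj_ref(Obj *o)
{
    o->refcount++;
}

static void obj_unref(Obj *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        free(o);
        live_objects--;
    }
}

/* Models the getter: returns the global pointer as it is. */
static Obj *get_current(void)
{
    return current;
}

/* Models the reported shutdown path: drops a reference, keeps the global. */
static void shutdown_like_report(void)
{
    obj_unref(current);
}

/*
 * Scenario A: nobody else holds a reference. After shutdown the object is
 * freed while `current` still points at it, so any later get_current()
 * caller would receive a dangling pointer. The sketch never dereferences it.
 */
static int scenario_unprotected(void)
{
    obj_create_current();
    shutdown_like_report();
    current = NULL;          /* reset the sketch's global for the next run */
    return live_objects;     /* 0: the object is already gone */
}

/*
 * Scenario B: pending work took its own reference before shutdown, so the
 * object outlives shutdown and is freed only when that work finishes.
 */
static int scenario_with_ref(void)
{
    obj_create_current();
    Obj *mine = get_current();
    obj_ref(mine);
    shutdown_like_report();
    int seen = mine->state;  /* safe: our reference keeps the object alive */
    obj_unref(mine);         /* last reference dropped here */
    current = NULL;
    return seen;
}
```

Calling scenario_unprotected() returns 0 live objects, which is the state in which the real getter would hand out freed memory. scenario_with_ref() reads the object safely after shutdown because the extra reference delays the free.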
