Test fails with SIGSEGV because of use-after-free
Host environment
- Operating system: linux
- Architecture: x86
- QEMU flavor: qemu-system-x86_64
- QEMU version: 8.1.50 (commit d762bf97)
Description
migration-test sometimes fails with SIGSEGV. This can be reproduced more reliably by running:
meson test -C build qtest-x86_64/migration-test --test-args='-p /x86_64/migration/validate_uuid' --repeat 50 --print-errorlogs
This failure is caused by a use-after-free bug. When QEMU is terminated by SIGTERM during the test, it executes the cleanup function migration_shutdown:
void migration_shutdown(void)
{
    /*
     * When the QEMU main thread exit, the COLO thread
     * may wait a semaphore. So, we should wakeup the
     * COLO thread before migration shutdown.
     */
    colo_shutdown();
    /*
     * Cancel the current migration - that will (eventually)
     * stop the migration using this structure
     */
    migration_cancel(NULL);
    object_unref(OBJECT(current_migration)); // <----- BUG
    /*
     * Cancel outgoing migration of dirty bitmaps. It should
     * at least unref used block nodes.
     */
    dirty_bitmap_mig_cancel_outgoing();
    /*
     * Cancel incoming migration of dirty bitmaps. Dirty bitmaps
     * are non-critical data, and their loss never considered as
     * something serious.
     */
    dirty_bitmap_mig_cancel_incoming();
}
The problem is that object_unref is called on current_migration, but the object remains reachable through migrate_get_current, which can then return a pointer to a deallocated object. At this point, there can still be unfinished coroutines that use this object, as happens in the failing test.
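To illustrate the lifetime problem in isolation, here is a minimal sketch that does not use any QEMU code. ToyState, toy_get_current, toy_shutdown and the other toy_* names are hypothetical stand-ins for MigrationState, migrate_get_current and migration_shutdown; the reference counting is a simplified model, not QEMU's object model. It shows one possible way to keep a late user safe (the user pins the object with its own reference before shutdown drops the global one). This is an assumption about how such a bug could be avoided, not necessarily the fix QEMU should adopt.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for MigrationState; not QEMU's real type. */
typedef struct ToyState {
    int refcount;
    int value;
} ToyState;

static ToyState *current_state; /* plays the role of current_migration */
static int live_objects;        /* allocations that have not been freed yet */

static ToyState *toy_new(void)
{
    ToyState *s = malloc(sizeof(*s));
    if (!s) {
        abort();
    }
    s->refcount = 1;
    s->value = 0;
    live_objects++;
    return s;
}

static void toy_ref(ToyState *s)
{
    s->refcount++;
}

static void toy_unref(ToyState *s)
{
    if (--s->refcount == 0) {
        live_objects--;
        free(s);
    }
}

/* Like migrate_get_current(): hands out the global without taking a reference. */
static ToyState *toy_get_current(void)
{
    return current_state;
}

/*
 * Like the shutdown path in the report: drops the global's reference but
 * leaves the global pointer in place, so later callers of toy_get_current()
 * would receive a dangling pointer if this was the last reference.
 */
static void toy_shutdown(void)
{
    toy_unref(current_state);
}

/* Returns 1 if a user that pinned the object can still use it after shutdown. */
int demo_pinned_user_survives_shutdown(void)
{
    current_state = toy_new();
    current_state->value = 42;

    /* The "coroutine" takes its own reference before it may be interrupted. */
    ToyState *user = toy_get_current();
    toy_ref(user);

    toy_shutdown();                 /* refcount 2 -> 1, object stays alive */

    int seen = user->value;         /* safe: the user still holds a reference */
    int alive_during_use = live_objects;

    toy_unref(user);                /* last reference dropped, object freed */
    current_state = NULL;           /* do not leave a dangling global behind */

    return seen == 42 && alive_during_use == 1 && live_objects == 0;
}
```

Without the toy_ref call, toy_shutdown would free the object while user still points to it, which is the same pattern as in the failing test: the unfinished coroutine obtains the pointer via the getter and dereferences freed memory.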