Crash in virSecurityManagerTransactionStart - rare, but seemingly happening for years
Software environment
- Operating system: Ubuntu
- Architecture: x86
- kernel version: 5.4.0-80-generic
- libvirt version: 5.0 - 6.0
- Hypervisor and version: qemu 4.2
Description of problem
I found enough crash reports in virSecurityManagerTransactionStart to be concerned. They are on:
- https://errors.ubuntu.com/problem/227a7f4d7b698c340f928038ad78fcaf4bb2de5a
- https://errors.ubuntu.com/problem/49e76f2070b85b5bb1ef09b837c90f9b2fb8bf47
Access to those pages may be restricted to avoid exposing too much. Over the years I have seen 809 such crashes, but never a bug report or a reproducer.
The backtrace always looks like this:
#0 0x00007f6cb2fb253a in virSecurityManagerTransactionStart (mgr=0x7f6c6c010ff0) at ../../../src/security/security_manager.c:261
ret = 0
#1 0x00007f6ca5546779 in qemuSecurityRestoreAllLabel (driver=driver@entry=0x7f6c6c00f910, vm=vm@entry=0x7f6c800030e0, migrated=migrated@entry=false) at ../../../src/qemu/qemu_security.c:80
priv = 0x7f6c800031b0
transactionStarted = false
__func__ = "qemuSecurityRestoreAllLabel"
#2 0x00007f6ca54d85e5 in qemuProcessStop (driver=driver@entry=0x7f6c6c00f910, vm=vm@entry=0x7f6c800030e0, reason=reason@entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE, flags=flags@entry=0) at ../../../src/qemu/qemu_process.c:7398
ret = <optimized out>
retries = 0
priv = 0x7f6c800031b0
orig_err = 0x7f6c5c12fca0
def = 0x7f6ca001bb80
vport = 0x0
i = <optimized out>
timestamp = 0x7f6c5c126b40 "2021-05-10 19:52:10.390+0000"
cfg = 0x7f6c6c010280
conn = 0x0
__func__ = "qemuProcessStop"
#3 0x00007f6ca55442c9 in processMonitorEOFEvent (vm=0x7f6c800030e0, driver=0x7f6c6c00f910) at ../../../src/qemu/qemu_driver.c:4792
stopReason = <optimized out>
auditReason = <optimized out>
stopFlags = 0
priv = 0x7f6c800031b0
eventReason = <optimized out>
event = 0x7f6c5c129ff0
priv = <optimized out>
eventReason = <optimized out>
stopReason = <optimized out>
auditReason = <optimized out>
stopFlags = <optimized out>
event = <optimized out>
__func__ = "processMonitorEOFEvent"
#4 qemuProcessEventHandler (data=0x56147d16ece0, opaque=0x7f6c6c00f910) at ../../../src/qemu/qemu_driver.c:4898
processEvent = 0x56147d16ece0
vm = 0x7f6c800030e0
driver = 0x7f6c6c00f910
__func__ = "qemuProcessEventHandler"
#5 0x00007f6cb2f0309f in virThreadPoolWorker (opaque=opaque@entry=0x56147d23e2c0) at ../../../src/util/virthreadpool.c:163
data = 0x0
pool = 0x7f6c6c012510
cond = 0x7f6c6c012578
priority = false
curWorkers = 0x7f6c6c0125f0
maxLimit = 0x7f6c6c0125d8
job = 0x56147d17d300
#6 0x00007f6cb2f0240c in virThreadHelper (data=<optimized out>) at ../../../src/util/virthread.c:196
args = 0x0
local = {func = 0x7f6cb2f02fa0 <virThreadPoolWorker>, funcName = 0x7f6ca557f71b "qemuProcessEventHandler", worker = true, opaque = 0x56147d23e2c0}
#7 0x00007f6cb2bc5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
ret = <optimized out>
pd = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140103791343360, -6553929242546578803, 140728987402702, 140728987402703, 140728987402880, 140103791340480, 6616682297413819021, 6616528284930103949}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = 0
#8 0x00007f6cb2aec293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.
That function has not changed in a while and does not look too complex:
int
virSecurityManagerTransactionStart(virSecurityManagerPtr mgr)
{
    int ret = 0;
    virObjectLock(mgr);
    if (mgr->drv->transactionStart)
        ret = mgr->drv->transactionStart(mgr);
    virObjectUnlock(mgr);
    return ret;
}
virObjectLock checks the object it is given before accessing it, and the backtrace shows that mgr is not NULL.
So the only thing that comes to mind is that mgr->drv contains bad data, and dereferencing it to read mgr->drv->transactionStart is what crashes.
Could anyone explain how that condition can arise, and would the following be a fix worth submitting?
--- a/src/security/security_manager.c
+++ b/src/security/security_manager.c
@@ -257,7 +257,7 @@ virSecurityManagerTransactionStart(virSecurityManager *mgr)
     int ret = 0;
     virObjectLock(mgr);
-    if (mgr->drv->transactionStart)
+    if (mgr && mgr->drv && mgr->drv->transactionStart)
         ret = mgr->drv->transactionStart(mgr);
     virObjectUnlock(mgr);
     return ret;
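To make clear what the proposed check can and cannot catch, here is a minimal standalone sketch of the same guard. Everything in it is a hypothetical stand-in: MockManager, MockDriver, mock_lock, mock_unlock and start_ok are not libvirt names, and the mock lock is simply assumed to tolerate NULL. It only shows the control flow of the patched function, not libvirt's actual locking or object model.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the libvirt types; the real structs differ. */
typedef struct MockManager MockManager;

typedef struct {
    int (*transactionStart)(MockManager *mgr);
} MockDriver;

struct MockManager {
    MockDriver *drv;
    int locked; /* stand-in for the object lock */
};

/* Assumption: like the real lock helpers are expected to, these tolerate NULL. */
void mock_lock(MockManager *mgr)   { if (mgr) mgr->locked = 1; }
void mock_unlock(MockManager *mgr) { if (mgr) mgr->locked = 0; }

/* Same shape as the patched function: the callback is only invoked when
 * mgr, mgr->drv and the callback pointer are all non-NULL. */
int
mock_transaction_start(MockManager *mgr)
{
    int ret = 0;

    mock_lock(mgr);
    if (mgr && mgr->drv && mgr->drv->transactionStart)
        ret = mgr->drv->transactionStart(mgr);
    mock_unlock(mgr);
    return ret;
}

/* Example callback used to exercise the happy path. */
int start_ok(MockManager *mgr) { (void)mgr; return 42; }
```

With this guard, a NULL mgr, a NULL drv or a NULL callback all fall through and return 0. What it cannot detect is a non-NULL mgr->drv that points to freed or corrupted memory: such a pointer passes the check and is still dereferenced. If the crashes come from a dangling pointer rather than a NULL one, the patch would hide some cases without addressing the cause.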
Steps to reproduce
The issue comes from automated crash reports; I have no idea how to reproduce it :-/
P.S. I have not seen it in libvirt >6.0 yet, but I also did not find a commit that obviously fixed it. So maybe it was already fixed somewhere else, or there are simply not yet enough >6.0 instances running ...