
Fix Subscriptions deadlock

Mikhail Mazurskiy requested to merge ash2k/sub-deadlock into master

Dispatch() tries to send on the channel and blocks forever while holding the mutex:

runtime.gopark(proc.go:364)
runtime.selectgo(select.go:328)
gitlab.com/gitlab-org/cluster-integration/gitlab-agent/v16/internal/tool/syncz.(*Subscriptions[go.shape.int_0]).Dispatch(subscriptions.go:68)
gitlab.com/gitlab-org/cluster-integration/gitlab-agent/v16/internal/tool/syncz.TestSubscriptions_ConcurrentCancel.func2(subscriptions_test.go:99)
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(wait.go:75)
runtime.goexit(asm_arm64.s:1172)
k8s.io/apimachinery/pkg/util/wait.(*Group).Start(wait.go:73)

On() has stopped receiving from the channel but has not yet removed it from the list. Its deferred remove() call waits for the mutex forever:

runtime.gopark(proc.go:364)
runtime.goparkunlock(proc.go:369)
runtime.semacquire1(sema.go:150)
sync.runtime_SemacquireMutex(sema.go:77)
sync.(*Mutex).lockSlow(mutex.go:171)
sync.(*Mutex).Lock(mutex.go:90)
gitlab.com/gitlab-org/cluster-integration/gitlab-agent/v16/internal/tool/syncz.(*Subscriptions[go.shape.int_0]).remove(subscriptions.go:27)
gitlab.com/gitlab-org/cluster-integration/gitlab-agent/v16/internal/tool/syncz.(*Subscriptions[go.shape.int_0]).On.func1(subscriptions.go:48)
runtime.deferreturn(panic.go:476)
gitlab.com/gitlab-org/cluster-integration/gitlab-agent/v16/internal/tool/syncz.(*Subscriptions[go.shape.int_0]).On(subscriptions.go:53)
gitlab.com/gitlab-org/cluster-integration/gitlab-agent/v16/internal/tool/syncz.TestSubscriptions_ConcurrentCancel.func1(subscriptions_test.go:96)
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(wait.go:75)
runtime.goexit(asm_arm64.s:1172)
k8s.io/apimachinery/pkg/util/wait.(*Group).Start(wait.go:73)

Found while working on gitlab-org/gitlab#415632 (closed).

Edited by Mikhail Mazurskiy
