GroupDestroyWorker failure - ci_builds deadlock
Summary
Related to &7171
On GitLab.com, GroupDestroyWorker sometimes fails because a project in the group cannot be deleted. The failures are tracked in this Kibana dashboard.
We have narrowed down the failures to ten distinct Project#delete_error values. This issue deals with project deletion errors caused by deadlocks on ci_builds:
PG::TRDeadlockDetected: ERROR: deadlock detected
DETAIL: Process 74533 waits for ShareLock on transaction 3115395357; blocked by process 93598.
Process 93598 waits for ShareLock on transaction 3115395356; blocked by process 74533.
HINT: See server log for query details.
CONTEXT: while deleting tuple (43378125,4) in relation "ci_builds"
SQL statement "DELETE FROM ONLY "public"."ci_builds" WHERE $1 OPERATOR(pg_catalog.=) "commit_id""
and
PG::TRDeadlockDetected: ERROR: deadlock detected
DETAIL: Process 74074 waits for ShareLock on transaction 2991120241; blocked by process 15673.
Process 15673 waits for ShareLock on transaction 2991120242; blocked by process 74074.
HINT: See server log for query details.
CONTEXT: while deleting tuple (214006391,1) in relation "ci_builds"
SQL statement "DELETE FROM ONLY "public"."ci_builds" WHERE $1 OPERATOR(pg_catalog.=) "commit_id""
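In both errors, the DETAIL lines describe two processes that each wait on a transaction held by the other. As a rough illustration only, the sketch below extracts those wait relationships from a deadlock message and checks that they form a mutual wait. The names `WAIT_LINE`, `SAMPLE_MESSAGE`, `wait_edges`, and `mutual_wait?` are hypothetical helpers written for this example, not part of GitLab or the `pg` gem, and the regex assumes the message format shown above.

```ruby
# Hypothetical helpers for reading a PostgreSQL deadlock message.
# Assumes DETAIL lines of the form:
#   "Process A waits for <LockType> on transaction T; blocked by process B."
WAIT_LINE = /Process (\d+) waits for \w+ on transaction \d+; blocked by process (\d+)\./

# DETAIL lines copied from the first error above.
SAMPLE_MESSAGE = <<~LOG
  DETAIL: Process 74533 waits for ShareLock on transaction 3115395357; blocked by process 93598.
  Process 93598 waits for ShareLock on transaction 3115395356; blocked by process 74533.
LOG

# Returns [waiter_pid, blocker_pid] pairs found in the message.
def wait_edges(message)
  message.scan(WAIT_LINE).map { |waiter, blocker| [Integer(waiter), Integer(blocker)] }
end

# True when every waiter is itself blocking the process it waits on,
# i.e. the two-process cycle seen in both errors above.
def mutual_wait?(edges)
  edges.any? && edges.all? { |waiter, blocker| edges.include?([blocker, waiter]) }
end

edges = wait_edges(SAMPLE_MESSAGE)
puts edges.inspect
puts mutual_wait?(edges)
```

This only confirms the shape of the deadlock from the client-side message. Identifying the statements involved still requires the server log, as the HINT line says.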
The full list of delete_error values can be found here: https://gitlab.com/gitlab-org/gitlab/-/issues/342692#note_737332055
Impact
In the past week, GroupDestroyWorker has failed ~1500 times because deletion of 138 projects was attempted over and over again.
Recommendation
Verification
Edited by Serena Fang