praefect database throws deadlock warnings sometimes
We are running 3 praefect instances with 3 gitaly servers (all 13.3.1) and the praefect db is sized with 8 Cores ad 16GiB memory. Today we got the following lines in the log of the praefect database and think a deadlock is a bug:
2020-08-27 09:00:44.333 UTC [9791] praefect@praefect_production ERROR: deadlock detected
2020-08-27 09:00:44.333 UTC [9791] praefect@praefect_production DETAIL: Process 9791 waits for ShareLock on transaction 1021240; blocked by process 9785. Process 9785 waits for ShareLock on transaction 1021247; blocked by process 9791. Process 9791: UPDATE replication_queue_job_lock SET triggered_at = NOW() AT TIME ZONE 'UTC' WHERE (job_id, lock_id) IN (SELECT UNNEST($1::BIGINT[]), UNNEST($2::TEXT[])) Process 9785: WITH existing AS ( SELECT id, lock_id, updated_at, job FROM replication_queue WHERE id = ANY($1) AND state = 'in_progress' FOR UPDATE ) , to_release AS ( UPDATE replication_queue AS queue SET state = CASE WHEN state = 'in_progress' THEN $2::REPLICATION_JOB_STATE ELSE (CASE WHEN $2 = 'completed' THEN 'completed' ELSE queue.state END)::REPLICATION_JOB_STATE END, updated_at = CASE WHEN state = 'in_progress' THEN NOW() AT TIME ZONE 'UTC' ELSE (CASE WHEN $2 = 'completed' THEN NOW() AT TIME ZONE 'UTC' ELSE queue.updated_at END) END FROM existing WHERE existing.id = queue.id OR ( queue.state = 'ready' AND queue.created_at < existing.updated_at AND queue.lock_id = existing.lock_id AND queue.job->>'change' = existing.job->>'change' AND queue.job->>'source_node_storage' = existing.job->>'source_node_storage' ) RETURNING queue.id, queue.lock_id ) , removed_job_lock AS ( DELETE FROM replic
2020-08-27 09:00:44.333 UTC [9791] praefect@praefect_production HINT: See server log for query details.
2020-08-27 09:00:44.333 UTC [9791] praefect@praefect_production CONTEXT: while updating tuple (9,41) in relation "replication_queue_job_lock"
2020-08-27 09:00:44.333 UTC [9791] praefect@praefect_production STATEMENT: UPDATE replication_queue_job_lock SET triggered_at = NOW() AT TIME ZONE 'UTC' WHERE (job_id, lock_id) IN (SELECT UNNEST($1::BIGINT[]), UNNEST($2::TEXT[]))