Skip to content

Use AccessExclusive lock for adding FK to post-deploy migration

Stan Hu requested to merge sh-use-access-exclusive-lock into master

What does this MR do and why?

In gitlab-com/gl-infra/production#8521 (closed), we saw that the attempt to add a foreign key from p_ci_builds_metadata to ci_builds failed due to a deadlock. The deadlock appears to happen because while ShareUpdateExclusiveLock allows reads but prevents concurrent modifications to ci_builds, an open transaction trying to upgrade to a RowExclusiveLock will deadlock while the ALTER TABLE attempts to grab an AccessExclusiveLock. This can happen on a busy system like GitLab.com.

For example, we saw:

  1. PipelineProcessWorker starts a new transaction and reads some data from ci_builds.
  2. The migration calls LOCK TABLE...IN SHARE ROW EXCLUSIVE MODE, which prevents other transactions from updating rows in ci_builds while allowing reads.
  3. The migration calls ALTER TABLE and tries to get a AccessExclusiveLock. It has to wait for the transaction in 1 to finish.
  4. PipelineProcessWorker attempts to acquire a RowExclusiveLock, but it can't because of step 2. 💥

AccessExclusiveLock is the heaviest lock that prevents any read or writes from the table. In this case, we need to use this to ensure we have complete control of the tables and prevent read transactions from starting in the first place.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Stan Hu

Merge request reports