
Reorder primary key columns for ci_runner_machine_builds - try 2

Marius Bobin requested to merge 397014-reorder-pk-columns into master

What does this MR do and why?

This is a second attempt at !114894 (merged), which was reverted in !115592 (merged).

There were two problems with the first approach:

  1. The migration was executed while a long vacuum process was running:

image (source)

The migration was attempted at different times (1st column), and all attempts were blocked by the same process (3rd column). Process 3906324 was running a vacuum to prevent wraparound on ci_builds, which took around 10 hours to complete:

image

We don't have a solution for this problem yet, other than interrupting the vacuum process, which is a safe operation: gitlab-com/gl-infra/production#8588 (comment 1329705387). We're discussing more options in !115874 (comment 1332504579).

  2. A ShareRowExclusiveLock on ci_builds is hard to acquire because of all the activity on that table. This is why the migration deadlocked on one of the executions.

image (source)

To solve this, we can acquire an access exclusive lock on both tables at the beginning of the execution. We used this technique to add the FK between ci_builds and p_ci_builds_metadata: !114363 (merged)
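As a rough illustration of the idea (not the migration code in this MR), the sketch below builds the SQL that takes every lock up front in a single `LOCK TABLE` statement. The helper name `build_lock_statements`, the table names passed to it, and the `lock_timeout` value are assumptions made for the example. PostgreSQL locks the listed tables one by one in the order given, so using the same order everywhere keeps lock acquisition predictable.

```python
def build_lock_statements(tables, lock_timeout_ms=100):
    """Return SQL statements that lock all tables before any DDL runs.

    Hypothetical helper for illustration only. The timeout is an assumed
    value: it makes the transaction give up quickly instead of queueing
    behind a long-running process while holding other locks.
    """
    if not tables:
        raise ValueError("at least one table is required")

    # Drop duplicates while preserving the caller's lock order.
    ordered = list(dict.fromkeys(tables))

    # Quote identifiers, doubling any embedded double quotes.
    quoted = ", ".join('"{}"'.format(name.replace('"', '""')) for name in ordered)

    return [
        "BEGIN",
        f"SET LOCAL lock_timeout TO '{int(lock_timeout_ms)}ms'",
        f"LOCK TABLE {quoted} IN ACCESS EXCLUSIVE MODE",
    ]


if __name__ == "__main__":
    # Assumed table names; the schema changes would follow in the same
    # transaction, and the locks are released on COMMIT or ROLLBACK.
    for statement in build_lock_statements(["ci_builds", "p_ci_runner_machine_builds"]):
        print(statement + ";")
```

If the lock cannot be acquired within the timeout, the statement fails and the transaction can be rolled back and retried later, so no lock is held while waiting for another one.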

Data deletion

This MR deletes the data from the p_ci_runner_machine_builds partition. We need to delete it because that makes reordering the columns easier.

This happens only on .com because the feature is still behind a feature flag that is disabled by default. It's okay to remove the data because it's runner-related metadata, not user-facing, and it doesn't affect job execution.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #397014 (closed)

Edited by Marius Bobin
