Add nullify job for orphan runner_id columns of ci_builds
What does this MR do and why?
When a runner gets deleted, jobs that were run with it still have runner_id
. This MR adds a background migration job to nullify old orphan runner_id
columns of ci_builds
.
There will be 3 steps (https://docs.gitlab.com/ee/development/database/add_foreign_key_to_existing_column.html);
- Add a NOT VALID foreign key constraint to the column to ensure GitLab doesn't create inconsistent records. <-- !80203 (merged)
- Add a data migration, to fix or clean up existing records. <-- You're here
- Validate the whole table by making the foreign key VALID. <-- will be in 14.10
This is the 2nd step.
Related comment: #30159 (comment 836146520)
Database
Up
== 20220223112304 ScheduleNullifyOrphanRunnerIdOnCiBuilds: migrating ==========
== 20220223112304 ScheduleNullifyOrphanRunnerIdOnCiBuilds: migrated (0.0513s) =
Down
== 20220223112304 ScheduleNullifyOrphanRunnerIdOnCiBuilds: reverting ==========
== 20220223112304 ScheduleNullifyOrphanRunnerIdOnCiBuilds: reverted (0.0000s) =
Update Query
UPDATE "ci_builds"
SET "runner_id" = NULL, "lock_version" = COALESCE("lock_version", 0) + 1
WHERE "ci_builds"."id" IN (
SELECT "ci_builds"."id"
FROM "ci_builds"
LEFT OUTER JOIN ci_runners ON ci_runners.id = ci_builds.runner_id
WHERE (ci_builds.runner_id IS NOT NULL AND ci_runners.id IS NULL)
AND "ci_builds"."id" BETWEEN 1 AND 1000
)
Total Time
- We have ~2B
ci_builds
rows. - With a 100k batch size, we'll have 20k batches.
- (20k * 2 min interval) / 60 min = ~666 hours.
- 666 hours / 24 = ~27 days.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #30159 (closed)
Edited by Furkan Ayhan