Skip to content

Handle sharding_key_id pointing to missing projects with regards to project-owned runners

Once #511343 (closed) is deployed on .com, it is important to make sure that any existing runners, runner managers, and runner taggings have a valid sharding_key_id. Any records that don't have an alternative owner project should be deleted.

 info    Data state at: 2025-01-08 07:57:57 UTC
Server: PostgreSQL 16.4 (Debian 16.4-1.pgdg110+2)
Version: 4.1.0
Home: http://pgcli.com
gitlabhq_dblab> SELECT COUNT(*) FROM ci_runners WHERE runner_type = 3 AND sharding_key_id IS NOT NULL AND NOT EXISTS (SELECT 1 FROM ci_runner_projects WHERE ci_runners.sharding_key_id = ci_runner_projects.project_id AND ci_runners.id = ci_runner_projects.runner_id)
+-------+
| count |
|-------|
| 7963  |
+-------+
SELECT 1
Time: 41.322s (41 seconds), executed in: 41.318s (41 seconds)
gitlabhq_dblab> SELECT COUNT(*) FROM ci_runner_machines WHERE runner_type = 3 AND sharding_key_id IS NOT NULL AND NOT EXISTS (SELECT 1 FROM ci_runner_projects WHERE ci_runner_machines.sharding_key_id = ci_runner_projects.project_id AND ci_runner_machines.runner_id = ci_runner_proj
 ects.runner_id)
+-------+
| count |
|-------|
| 1187  |
+-------+
SELECT 1
Time: 14.279s (14 seconds), executed in: 14.277s (14 seconds)
gitlabhq_dblab> SELECT COUNT(*) FROM ci_runner_taggings WHERE runner_type = 3 AND sharding_key_id IS NOT NULL AND NOT EXISTS (SELECT 1 FROM ci_runner_projects WHERE ci_runner_taggings.sharding_key_id = ci_runner_projects.project_id AND ci_runner_taggings.runner_id = ci_runner_proj
 ects.runner_id)
+-------+
| count |
|-------|
| 4522  |
+-------+
SELECT 1
Time: 3.381s (3 seconds), executed in: 3.379s (3 seconds)

Proposal

For this, we need to write a batched background migration that will iterate project runners that don't have a matching ci_runners_projects record but do have a fallback ci_runner_projects record, and assign the project_id of the oldest ci_runner_projects record for that runner id.

UPDATE
   "ci_runners"
 SET sharding_key_id = (
     SELECT project_id
     FROM ci_runner_projects
     WHERE ci_runners.id = ci_runner_projects.runner_id
     ORDER BY id ASC
     LIMIT 1)
 WHERE runner_type = 3
   AND sharding_key_id IS NOT NULL
   AND NOT EXISTS (
     SELECT 1
      FROM ci_runner_projects
      WHERE ci_runners.sharding_key_id = ci_runner_projects.project_id
        AND ci_runners.id = ci_runner_projects.runner_id)
   AND EXISTS (
     SELECT 1
     FROM ci_runner_projects
     WHERE ci_runners.id = ci_runner_projects.runner_id)