Add foreign key constraint to partitions of ci_runner_machines_687967fa8a
What does this MR do and why?
This MR recreates a foreign key constraint on ci_runner_machines_687967fa8a (follow-up issue for non-.com), which was removed following an S2 incident on .com. The incident happened because the FK was added before !166520 (merged), which completes the backfill of the target ci_runners_e59bb2812d table, had been merged. The backfill in production completed on Wed 6th, so we can now add the FK back. We can't validate the FK at the same time because, even with the following integrity guarantees:
- a fk_rails_666b61f04f constraint in ci_runner_machines, which makes sure that records point to a valid ci_runners record.
- a table_sync_trigger_bc3e7b56bd trigger in ci_runner_machines, which syncs any changes from ci_runner_machines to the new ci_runner_machines_687967fa8a partitioned table.
there are 265 runner managers referencing 260 missing runners (from the new ci_runner_machines_687967fa8a table to ci_runners_e59bb2812d), even though those runners are present on ci_runners:
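A mismatch of this kind can be surfaced with an anti-join from the partitioned runner manager table to the sharded runners table. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, the tables are reduced to the columns needed for the join (the real schemas have more columns), and the sample rows are made up.

```python
import sqlite3

# Simplified stand-ins for the real tables; only the columns needed here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ci_runners (id INTEGER PRIMARY KEY);
    CREATE TABLE ci_runners_e59bb2812d (id INTEGER PRIMARY KEY);
    CREATE TABLE ci_runner_machines_687967fa8a (
        id INTEGER PRIMARY KEY,
        runner_id INTEGER NOT NULL
    );
""")

# Made-up data: runners 1-3 exist in the legacy table, only runner 1 was copied.
conn.executemany("INSERT INTO ci_runners (id) VALUES (?)", [(1,), (2,), (3,)])
conn.execute("INSERT INTO ci_runners_e59bb2812d (id) VALUES (1)")
# One manager points at runner 1, two at runner 2, one at runner 3.
conn.executemany(
    "INSERT INTO ci_runner_machines_687967fa8a (id, runner_id) VALUES (?, ?)",
    [(10, 1), (11, 2), (12, 2), (13, 3)],
)

# Anti-join: managers whose runner_id has no match in the sharded table.
ORPHAN_QUERY = """
    SELECT COUNT(*) AS orphaned_managers,
           COUNT(DISTINCT m.runner_id) AS missing_runners
    FROM ci_runner_machines_687967fa8a AS m
    LEFT JOIN ci_runners_e59bb2812d AS r ON r.id = m.runner_id
    WHERE r.id IS NULL
"""
orphaned_managers, missing_runners = conn.execute(ORPHAN_QUERY).fetchone()
print(orphaned_managers, missing_runners)
```

Because several managers can share one runner, the two counts can differ, which is consistent with more orphaned managers than missing runners.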
This MR also adds logic to ensure a copy of the runner exists in the sharded table whenever a runner manager is created, to avoid a relapse of the incident.
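The idea of ensuring the parent row exists before inserting the child can be sketched as follows. This is not the code in this MR: it is a minimal illustration using Python's sqlite3 module in place of PostgreSQL, with simplified single-column tables, and create_runner_manager is a hypothetical helper name. In PostgreSQL the equivalent of INSERT OR IGNORE would typically be INSERT ... ON CONFLICT DO NOTHING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE ci_runners_e59bb2812d (id INTEGER PRIMARY KEY);
    CREATE TABLE ci_runner_machines_687967fa8a (
        id INTEGER PRIMARY KEY,
        runner_id INTEGER NOT NULL REFERENCES ci_runners_e59bb2812d (id)
    );
""")


def create_runner_manager(conn, manager_id, runner_id):
    """Hypothetical helper: ensure the parent row exists, then insert the child."""
    with conn:  # commit both statements together, roll back on error
        conn.execute(
            "INSERT OR IGNORE INTO ci_runners_e59bb2812d (id) VALUES (?)",
            (runner_id,),
        )
        conn.execute(
            "INSERT INTO ci_runner_machines_687967fa8a (id, runner_id) VALUES (?, ?)",
            (manager_id, runner_id),
        )


# Without the parent row, the FK rejects a direct insert of the child.
try:
    conn.execute(
        "INSERT INTO ci_runner_machines_687967fa8a (id, runner_id) VALUES (1, 42)"
    )
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
    conn.rollback()

# With the helper, the parent is created on first use and reused afterwards.
create_runner_manager(conn, manager_id=1, runner_id=42)
create_runner_manager(conn, manager_id=2, runner_id=42)

runner_count = conn.execute(
    "SELECT COUNT(*) FROM ci_runners_e59bb2812d"
).fetchone()[0]
manager_count = conn.execute(
    "SELECT COUNT(*) FROM ci_runner_machines_687967fa8a"
).fetchone()[0]
```

Running both statements in one transaction means a failed child insert does not leave a stray parent row behind.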
Changelog: added
References
Please include cross links to any resources that are relevant to this MR. This will give reviewers and future readers helpful context to give an efficient review of the changes introduced.
- Recreate foreign key constraints from ci_runner... (#502403 - closed)
- 2024-10-31: Runner verification API returning 500 (gitlab-com/gl-infra/production#18792 - closed)
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
Registering a runner manager on an existing runner
- taking a newly created runner (never contacted):
- ensuring the runner does not exist on the partitioned table:
- calling the POST /runners/verify endpoint on it:
- confirm that the runner manager appears correctly on GitLab:
- the partitioned table runner was created:
Updating an existing runner manager that doesn't have its runner on ci_runners_e59bb2812d
- we'll start by deleting the runner from the partitioned table, to emulate an online runner which doesn't have its record on the partitioned table, but gets contacted and needs to update the ci_runner_machines table:
- the ci_runner_machines.contacted_at field gets updated, but no equivalent record is created in ci_runner_machines_687967fa8a (as I believe the sync trigger only updates records if they exist) and therefore the FK constraint is not checked. As expected, a corresponding ci_runners_e59bb2812d record doesn't get created:
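The behaviour described above can be reproduced in miniature under the stated belief that the sync trigger only mirrors updates onto rows that already exist. The sketch below is an assumption-laden illustration, not the definition of table_sync_trigger_bc3e7b56bd: it uses Python's sqlite3 module instead of PostgreSQL, a hypothetical trigger named sync_on_update, simplified tables, and an arbitrary timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE ci_runners_e59bb2812d (id INTEGER PRIMARY KEY);
    CREATE TABLE ci_runner_machines (
        id INTEGER PRIMARY KEY,
        runner_id INTEGER NOT NULL,
        contacted_at TEXT
    );
    CREATE TABLE ci_runner_machines_687967fa8a (
        id INTEGER PRIMARY KEY,
        runner_id INTEGER NOT NULL REFERENCES ci_runners_e59bb2812d (id),
        contacted_at TEXT
    );
    -- Assumed behaviour: updates are mirrored only onto rows that already exist.
    CREATE TRIGGER sync_on_update AFTER UPDATE ON ci_runner_machines
    BEGIN
        UPDATE ci_runner_machines_687967fa8a
        SET contacted_at = NEW.contacted_at
        WHERE id = NEW.id;
    END;
    -- A manager whose runner and mirrored row are both absent from the new tables.
    INSERT INTO ci_runner_machines (id, runner_id, contacted_at)
    VALUES (1, 42, NULL);
""")

# The runner contacts the instance: only the legacy row is touched.
conn.execute(
    "UPDATE ci_runner_machines SET contacted_at = '2000-01-01T00:00:00Z' WHERE id = 1"
)
conn.commit()

contacted_at = conn.execute(
    "SELECT contacted_at FROM ci_runner_machines WHERE id = 1"
).fetchone()[0]
mirrored_rows = conn.execute(
    "SELECT COUNT(*) FROM ci_runner_machines_687967fa8a"
).fetchone()[0]
sharded_runners = conn.execute(
    "SELECT COUNT(*) FROM ci_runners_e59bb2812d"
).fetchone()[0]
```

The mirrored UPDATE matches zero rows, so nothing is written to the table carrying the FK and the constraint never has anything to check.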
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
n/a









