Make sure project_feature records always exist
What does this MR do?
This MR adds a background migration that checks existing projects and creates a project_features record for every project that is missing one. On GitLab.com, this is currently the case for only 3 projects.
The related !24588 (merged) adds a validation for the project -> project_features association.
Issue with background: #34367 (closed)
Dependency for the migration log file: gitlab-cookbooks/gitlab_fluentd#2 (closed)
Background Migration on GitLab.com
- Jobs scheduled: 240 for a total of about 12M projects
- Total duration: 8 hours
- Affected records that are going to see an update: 3
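As a rough sanity check of these figures, the sketch below recomputes the job count and the implied spacing between jobs. It assumes a batch size of 50_000 (the size used for the query plans in this description) and a dense id space of about 12M; `job_count` and `batch_ranges` are hypothetical helper names, not part of the migration.

```ruby
# Back-of-the-envelope check of the scheduling figures (all inputs are approximations).
TOTAL_PROJECTS = 12_000_000 # "about 12M projects"
BATCH_SIZE     = 50_000     # assumption: same batch size as in the query plans
TOTAL_HOURS    = 8          # reported total duration

# Number of jobs needed to cover every id once, rounding up for a partial last batch.
def job_count(total, batch_size)
  (total + batch_size - 1) / batch_size
end

# Inclusive [from, to] id ranges, one per job.
def batch_ranges(first_id, last_id, batch_size)
  first_id.step(last_id, batch_size).map do |from|
    [from, [from + batch_size - 1, last_id].min]
  end
end

jobs = job_count(TOTAL_PROJECTS, BATCH_SIZE)
minutes_between_jobs = TOTAL_HOURS * 60.0 / jobs

puts "jobs: #{jobs}"
puts "implied spacing: #{minutes_between_jobs} min per job"
```

Under these assumptions the numbers line up: 12M ids in batches of 50_000 give 240 jobs, and 8 hours spread over 240 jobs works out to about 2 minutes per job.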
Unfortunately, the anti-join query that finds projects without a project_features record takes too long to execute, so we can't look up those 3 records directly and update them. Instead, the migration iterates over all projects in batches.
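The batch-wise approach can be illustrated with a small in-memory sketch. Everything here is hypothetical: the arrays stand in for the two tables, and `backfill` is an illustrative name. The real migration works on id ranges in SQL rather than on Ruby collections.

```ruby
require 'set'

# Hypothetical in-memory stand-ins for the two tables.
project_ids         = (1..20).to_a
feature_project_ids = Set.new(project_ids - [7, 13]) # projects 7 and 13 lack a record

# Walk the projects in fixed-size batches and create only the missing records.
# Returns the ids for which a record was created.
def backfill(project_ids, feature_project_ids, batch_size)
  created = []
  project_ids.each_slice(batch_size) do |batch|
    # Per-batch anti-join: cheap because it only looks at one small window.
    missing = batch.reject { |id| feature_project_ids.include?(id) }
    missing.each { |id| feature_project_ids.add(id) }
    created.concat(missing)
  end
  created
end

created = backfill(project_ids, feature_project_ids, 5)
puts created.inspect
```

Most batches find nothing to do, which is the trade-off described above: many cheap per-batch checks instead of one expensive global anti-join. Running the backfill a second time creates nothing, so the operation is safe to retry.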
Query plans
For batch size 50_000:
- Example for no records to insert: 140ms https://explain.depesz.com/s/mQab8
- Example for inserting one record for each project in range (about 35k out of 50k range, from my test instance): 2,800ms https://explain.depesz.com/s/g8J
WITH created_records AS (
    INSERT INTO project_features (project_id, merge_requests_access_level, issues_access_level, wiki_access_level, snippets_access_level, builds_access_level, repository_access_level, pages_access_level, created_at, updated_at)
    SELECT
        projects.id,
        20,
        20,
        20,
        20,
        20,
        20,
        20,
        NOW(),
        NOW()
    FROM
        projects
    WHERE
        projects.id BETWEEN 2100000 AND 2150000
        AND NOT EXISTS (
            SELECT
                1
            FROM
                project_features
            WHERE
                project_features.project_id = projects.id)
    ON CONFLICT (project_id)
        DO NOTHING
    RETURNING
        *
)
SELECT
    COUNT(*) AS number_of_created_records
FROM
    created_records;
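To show how a statement like this might be rendered per batch, here is a minimal sketch. `backfill_sql`, `ACCESS_LEVEL`, and `ACCESS_COLUMNS` are hypothetical names for illustration; the column list and the value 20 are copied from the statement above, and the actual migration code may build its SQL differently.

```ruby
# Hypothetical helper: renders the batch statement for an inclusive id range.
ACCESS_LEVEL = 20 # value written to every *_access_level column in the statement above

ACCESS_COLUMNS = %w[
  merge_requests_access_level issues_access_level wiki_access_level
  snippets_access_level builds_access_level repository_access_level
  pages_access_level
].freeze

def backfill_sql(from_id, to_id)
  # Coerce to integers so nothing but numbers is interpolated into the SQL.
  from_id = Integer(from_id)
  to_id = Integer(to_id)
  raise ArgumentError, 'empty range' if from_id > to_id

  columns = (['project_id'] + ACCESS_COLUMNS + %w[created_at updated_at]).join(', ')
  values = (['projects.id'] + [ACCESS_LEVEL] * ACCESS_COLUMNS.size + ['NOW()', 'NOW()']).join(', ')

  <<~SQL
    WITH created_records AS (
      INSERT INTO project_features (#{columns})
      SELECT #{values}
      FROM projects
      WHERE projects.id BETWEEN #{from_id} AND #{to_id}
        AND NOT EXISTS (
          SELECT 1 FROM project_features
          WHERE project_features.project_id = projects.id)
      ON CONFLICT (project_id) DO NOTHING
      RETURNING *
    )
    SELECT COUNT(*) AS number_of_created_records FROM created_records;
  SQL
end

puts backfill_sql(2_100_000, 2_150_000)
```

The `Integer()` coercion rejects non-numeric bounds before they reach the SQL string, which matters because the range is interpolated rather than bound as a parameter in this sketch.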
Does this MR meet the acceptance criteria?
Conformity