Update unknown ci_pipeline_artifacts.locked column values
What does this MR do and why?
Currently the SQL query for finding expired CI pipeline artifacts (not
to be confused with CI job artifacts) can time out since it may have
to perform a nested loop join between ci_pipeline_artifacts
with
ci_pipelines
.
In !97194 (merged), we
added a locked
column to speed up the finding of expirable
artifacts. When a pipeline is unlocked, it will also unlock the
associated pipeline artifacts
(!97228 (merged)).
However, on GitLab.com, there are still 807,122
ci_pipeline_artifacts
rows that have an unknown lock status. This
commit adds a batched background migration that will update these
values according to their associated ci_pipelines.locked
value.
Once this data is updated, !98633 (merged) will take advantage of this. We can then simplify the deletion of CI pipeline artifacts by removing the expensive query.
Relates to #372500 (closed)
How to set up and validate locally
- Insert some pipelines with various
locked
states:
INSERT INTO ci_pipelines (locked) VALUES (1);
INSERT INTO ci_pipelines (locked) VALUES (0);
- Find the
id
:
SELECT max(id) FROM ci_pipelines;
-[ RECORD 1 ]
max | 347
- Insert the
ci_pipeline_artifacts
to thisid
(347) andid - 1
(346):
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 346, 1, 1024, 1, 1, 1, 'test.gz');
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 346, 1, 1024, 1, 2, 1, 'test.gz');
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 346, 1, 1024, 1, 3, 1, 'test.gz');
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 347, 1, 1024, 1, 1, 1, 'test.gz');
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 347, 1, 1024, 1, 2, 1, 'test.gz');
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 347, 1, 1024, 1, 3, 1, 'test.gz');
- Run:
bundle exec rails db:migrate
- In
log/application.log
, you should be able to confirm that the migrations went through:
2022-09-29T06:40:12.573Z: {:message=>"BatchedJob transition", :batched_job_id=>21, :previous_state=>:pending, :new_state=>:running, :batched_migration_id=>21, :job_class_name=>"UpdateCiPipelineArtifactsUnknownLockedStatus", :job_arguments=>[], :exception_class=>nil, :exception_message=>nil}
2022-09-29T06:40:12.603Z: {:message=>"BatchedJob transition", :batched_job_id=>21, :previous_state=>:running, :new_state=>:succeeded, :batched_migration_id=>21, :job_class_name=>"UpdateCiPipelineArtifactsUnknownLockedStatus", :job_arguments=>[], :exception_class=>nil, :exception_message=>nil}
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.