Skip to content

Update unknown ci_pipeline_artifacts.locked column values

What does this MR do and why?

Currently the SQL query for finding expired CI pipeline artifacts (not to be confused with CI job artifacts) can time out since it may have to perform a nested loop join between ci_pipeline_artifacts with ci_pipelines.

In !97194 (merged), we added a locked column to speed up the finding of expirable artifacts. When a pipeline is unlocked, it will also unlock the associated pipeline artifacts (!97228 (merged)).

However, on GitLab.com, there are still 807,122 ci_pipeline_artifacts rows that have an unknown lock status. This commit adds a batched background migration that will update these values according to their associated ci_pipelines.locked value.

Once this data is updated, !98633 (merged) will take advantage of this. We can then simplify the deletion of CI pipeline artifacts by removing the expensive query.

Relates to #372500 (closed)

How to set up and validate locally

  1. Insert some pipelines with various locked states:
INSERT INTO ci_pipelines (locked) VALUES (1);
INSERT INTO ci_pipelines (locked) VALUES (0);
  1. Find the id:
SELECT max(id) FROM ci_pipelines;
-[ RECORD 1 ]
max | 347
  1. Insert the ci_pipeline_artifacts to this id (347) and id - 1 (346):
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 346, 1, 1024, 1, 1, 1, 'test.gz');
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 346, 1, 1024, 1, 2, 1, 'test.gz');
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 346, 1, 1024, 1, 3, 1, 'test.gz');

INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 347, 1, 1024, 1, 1, 1, 'test.gz');
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 347, 1, 1024, 1, 2, 1, 'test.gz');
INSERT INTO ci_pipeline_artifacts (created_at, updated_at, pipeline_id, project_id, size, file_store, file_type, file_format, file) VALUES ('2022-09-28', '2022-09-28', 347, 1, 1024, 1, 3, 1, 'test.gz');
  1. Run:
bundle exec rails db:migrate
  1. In log/application.log, you should be able to confirm that the migrations went through:
2022-09-29T06:40:12.573Z: {:message=>"BatchedJob transition", :batched_job_id=>21, :previous_state=>:pending, :new_state=>:running, :batched_migration_id=>21, :job_class_name=>"UpdateCiPipelineArtifactsUnknownLockedStatus", :job_arguments=>[], :exception_class=>nil, :exception_message=>nil}
2022-09-29T06:40:12.603Z: {:message=>"BatchedJob transition", :batched_job_id=>21, :previous_state=>:running, :new_state=>:succeeded, :batched_migration_id=>21, :job_class_name=>"UpdateCiPipelineArtifactsUnknownLockedStatus", :job_arguments=>[], :exception_class=>nil, :exception_message=>nil}

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Stan Hu

Merge request reports