Skip to content

Destroy unlocked and expired pipeline artifacts

Stan Hu requested to merge sh-destroy-unlocked-pipeline-artifacts into master

What does this MR do and why?

Currently the SQL query for finding expired CI pipeline artifacts (not to be confused with CI job artifacts) can time out since it may have to perform a nested loop join between ci_pipeline_artifacts with ci_pipelines.

In !97194 (merged), we added a locked column to speed up the finding of expirable artifacts. When a pipeline is unlocked, it will also unlock the associated pipeline artifacts (!97228 (merged)).

This commit takes advantage of this denormalization by modifying Ci::PipelineArtifacts::DestroyAllExpiredService to destroy unlocked and expired artifacts with the help of this locked column. Deleting these artifacts via this fast query can be enabled via the ci_destroy_unlocked_pipeline_artifacts feature flag.

The legacy query runs afterwards to ensure that pipeline artifacts with the denormalized locked column get a chance to be deleted.

Relates to #372500 (closed)

SQL queries

Legacy queries

SELECT "ci_pipeline_artifacts".* FROM "ci_pipeline_artifacts" INNER JOIN "ci_pipelines" ON "ci_pipelines"."id" = "ci_pipeline_artifacts"."pipeline_id" WHERE "ci_pipelines"."locked" = 0 AND "ci_pipeline_artifacts"."expire_at" < '2022-09-03 14:08:04.682420' LIMIT 100
gitlabhq_production=# explain (analyze, buffers) SELECT "ci_pipeline_artifacts".* FROM "ci_pipeline_artifacts" INNER JOIN "ci_pipelines" ON "ci_pipelines"."id" = "ci_pipeline_artifacts"."pipeline_id" WHERE "ci_pipelines"."locked" = 0 AND "ci_pipeline_artifacts"."expire_at" < '2022-09-03 14:08:04.682420' LIMIT 100;
 Limit  (cost=1.00..540.07 rows=100 width=169) (actual time=301019.448..304370.825 rows=1 loops=1)
   Buffers: shared hit=3921598 read=1262175
   I/O Timings: read=295778.275
   ->  Nested Loop  (cost=1.00..2973129.73 rows=551532 width=169) (actual time=301019.447..304370.823 rows=1 loops=1)
         Buffers: shared hit=3921598 read=1262175
         I/O Timings: read=295778.275
         ->  Index Scan using index_ci_pipeline_artifacts_on_expire_at on ci_pipeline_artifacts  (cost=0.43..148004.78 rows=861831 width=169) (actual time=2.472..7273.815 rows=861384 loops=1)
               Index Cond: (expire_at < '2022-09-03 14:08:04.68242+00'::timestamp with time zone)
               Buffers: shared hit=798784 read=77959
               I/O Timings: read=5648.962
         ->  Index Scan using ci_pipelines_pkey on ci_pipelines  (cost=0.57..3.28 rows=1 width=4) (actual time=0.344..0.344 rows=0 loops=861384)
               Index Cond: (id = ci_pipeline_artifacts.pipeline_id)
               Filter: (locked = 0)
               Rows Removed by Filter: 1
               Buffers: shared hit=3122814 read=1184216
               I/O Timings: read=290129.313
 Planning Time: 3.915 ms
 Execution Time: 304370.874 ms

New queries

SELECT "ci_pipeline_artifacts".* FROM "ci_pipeline_artifacts" WHERE "ci_pipeline_artifacts"."expire_at" < '2022-09-22 17:40:55.175054' AND "ci_pipeline_artifacts"."locked" = 0 LIMIT 100
                                                                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..35.25 rows=100 width=171) (actual time=0.023..0.527 rows=100 loops=1)
   Buffers: shared hit=452
   ->  Index Scan using ci_pipeline_artifacts_on_expire_at_for_removal on ci_pipeline_artifacts  (cost=0.42..62604.77 rows=179750 width=171) (actual time=0.021..0.515 rows=100 loops=1)
         Index Cond: (expire_at < '2022-09-22 17:40:55.175054+00'::timestamp with time zone)
         Buffers: shared hit=452
 Planning Time: 0.171 ms
 Execution Time: 0.558 ms
(7 rows)

How to set up and validate locally

  1. Set up a project with test coverage: https://docs.gitlab.com/ee/ci/testing/test_coverage_visualization.html
  2. Run a few pipelines.
  3. In gdk psql, check that SELECT COUNT(*) FROM ci_artifacts_pipelines.
  4. Manually run Ci::PipelineArtifacts::ExpireArtifactsWorker.new.perform.
  5. Repeat step 3. See that the artifacts have been pruned.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Stan Hu

Merge request reports