Ci::PipelineArtifacts::ExpireArtifactsWorker hitting statement timeout
We see in https://log.gprd.gitlab.net/goto/23c75f90-2bbf-11ed-8656-f5f2137823ba that Ci::PipelineArtifacts::ExpireArtifactsWorker
is hitting a statement timeout. For example:
SELECT "ci_pipeline_artifacts".* FROM "ci_pipeline_artifacts" INNER JOIN "ci_pipelines" ON "ci_pipelines"."id" = "ci_pipeline_artifacts"."pipeline_id" WHERE "ci_pipelines"."locked" = 0 AND "ci_pipeline_artifacts"."expire_at" < '2022-09-03 14:08:04.682420' LIMIT 100
The query plan looks a bit ugly with a nested loop over one large table (ci_pipeline_artifacts
) and one enormous one (ci_pipelines
):
gitlabhq_production=# explain (analyze, buffers) SELECT "ci_pipeline_artifacts".* FROM "ci_pipeline_artifacts" INNER JOIN "ci_pipelines" ON "ci_pipelines"."id" = "ci_pipeline_artifacts"."pipeline_id" WHERE "ci_pipelines"."locked" = 0 AND "ci_pipeline_artifacts"."expire_at" < '2022-09-03 14:08:04.682420' LIMIT 100;
Limit (cost=1.00..540.07 rows=100 width=169) (actual time=301019.448..304370.825 rows=1 loops=1)
Buffers: shared hit=3921598 read=1262175
I/O Timings: read=295778.275
-> Nested Loop (cost=1.00..2973129.73 rows=551532 width=169) (actual time=301019.447..304370.823 rows=1 loops=1)
Buffers: shared hit=3921598 read=1262175
I/O Timings: read=295778.275
-> Index Scan using index_ci_pipeline_artifacts_on_expire_at on ci_pipeline_artifacts (cost=0.43..148004.78 rows=861831 width=169) (actual time=2.472..7273.815 rows=861384 loops=1)
Index Cond: (expire_at < '2022-09-03 14:08:04.68242+00'::timestamp with time zone)
Buffers: shared hit=798784 read=77959
I/O Timings: read=5648.962
-> Index Scan using ci_pipelines_pkey on ci_pipelines (cost=0.57..3.28 rows=1 width=4) (actual time=0.344..0.344 rows=0 loops=861384)
Index Cond: (id = ci_pipeline_artifacts.pipeline_id)
Filter: (locked = 0)
Rows Removed by Filter: 1
Buffers: shared hit=3122814 read=1184216
I/O Timings: read=290129.313
Planning Time: 3.915 ms
Execution Time: 304370.874 ms
Edited by Stan Hu