Pipeline schedules page is really slow on some projects
The pipeline schedules page for this project (which has just five schedules) is very slow and sometimes times out:
That's when it runs this query:
SELECT "ci_pipelines".* FROM "ci_pipelines" WHERE "ci_pipelines"."pipeline_schedule_id" IN (45276, 43210, 43209, 43208, 13682) ORDER BY "ci_pipelines"."id" DESC
Here's an extreme example of a cold query plan for that:
gitlabhq_production=> EXPLAIN (analyze, buffers) SELECT "ci_pipelines".* FROM "ci_pipelines" WHERE "ci_pipelines"."pipeline_schedule_id" IN (45276, 43210, 43209, 43208, 13682) ORDER BY "ci_pipelines"."id" DESC;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=33609.12..33686.73 rows=31042 width=317) (actual time=40528.985..40534.862 rows=36357 loops=1)
Sort Key: id DESC
Sort Method: quicksort Memory: 11194kB
Buffers: shared hit=297 read=36151 written=8287
I/O Timings: read=38759.033 write=1176.065
-> Index Scan using index_ci_pipelines_on_pipeline_schedule_id on ci_pipelines (cost=0.57..31293.09 rows=31042 width=317) (actual time=3.569..40407.189 rows=36357 loops=1)
Index Cond: (pipeline_schedule_id = ANY ('{45276,43210,43209,43208,13682}'::integer[]))
Buffers: shared hit=294 read=36151 written=8287
I/O Timings: read=38759.033 write=1176.065
Planning Time: 99.826 ms
Execution Time: 40537.493 ms
(11 rows)
This query comes from a view: https://gitlab.com/gitlab-org/gitlab/-/blob/bc6d5aa6c1ce257dc4f5793c495f32889b3e6636/app/views/projects/pipeline_schedules/index.html.haml#L15
https://gitlab.com/gitlab-org/quality/triage-ops/-/pipeline_schedules has many more schedules, and is slow, but not as slow as the first project mentioned.
Proposal
In the schedules controller we are trying to preload the latest pipeline for each schedule:
def index
@scope = params[:scope]
@all_schedules = Ci::PipelineSchedulesFinder.new(@project).execute
@schedules = Ci::PipelineSchedulesFinder.new(@project).execute(scope: params[:scope])
.includes(:last_pipeline)
end
And last_pipeline
is defined as has_one :last_pipeline, -> { order(id: :desc) }, class_name: 'Ci::Pipeline'
ActiveRecord loads all the pipelines for the given schedules in memory and figures out what the last pipeline is in Ruby. I think we need a custom preloader that sets the last_pipeline
records using a more efficient SQL query:
SELECT *
FROM ci_pipelines
INNER JOIN (
SELECT distinct FIRST_VALUE(id) OVER (PARTITION BY pipeline_schedule_id order by id desc) AS id
from ci_pipelines
where pipeline_schedule_id in (?)
) last_pipelines on last_pipelines.id = ci_pipelines.id