GitLab Monitor CI Runner query takes over 1 second to run in production
Every minute, GitLab Monitor issues a CI runner query. On GitLab.com, each of these queries takes over 1 second to execute, leading to a slowlog entry being issued each time.
Examples are shown in this spreadsheet: https://docs.google.com/spreadsheets/d/1GfH77HY9PJyXudxjKCdPYADmdUced3fNv4cSfQ_JgC8/edit#gid=420932086
The query issued is:
SELECT
ci_builds.runner_id,
ci_runners.is_shared,
projects.namespace_id,
projects.mirror,
projects.mirror_trigger_builds,
ci_pipelines.pipeline_schedule_id,
ci_builds.trigger_request_id,
(COALESCE(namespaces.shared_runners_minutes_limit, application_settings.shared_runners_minutes, 0) = 0 OR
COALESCE(namespace_statistics.shared_runners_seconds, 0) < COALESCE(namespaces.shared_runners_minutes_limit, application_settings.shared_runners_minutes, 0) * 60) as has_minutes,
COUNT(*) AS count
FROM ci_builds
JOIN ci_runners
ON ci_runners.id = ci_builds.runner_id
JOIN projects
ON projects.id = ci_builds.project_id
JOIN ci_pipelines
ON ci_pipelines.id = ci_builds.commit_id
JOIN namespaces
ON namespaces.id = projects.namespace_id
LEFT JOIN namespace_statistics
ON namespace_statistics.namespace_id = namespaces.id
JOIN application_settings
ON application_settings.id = 1
WHERE ci_builds.type = 'Ci::Build'
AND ci_builds.status = 'running'
AND projects.pending_delete = 'f'
GROUP BY
ci_builds.runner_id,
ci_runners.is_shared,
projects.namespace_id,
projects.mirror,
projects.mirror_trigger_builds,
ci_pipelines.pipeline_schedule_id,
ci_builds.trigger_request_id,
namespaces.shared_runners_minutes_limit,
namespace_statistics.shared_runners_seconds,
application_settings.shared_runners_minutes
The explain plan on gitlab.com is as follows:
GroupAggregate (cost=497654.14..500288.06 rows=55451 width=40)
Group Key: ci_builds.runner_id, ci_runners.is_shared, projects.namespace_id, projects.mirror, projects.mirror_trigger_builds, ci_pipelines.pipeline_schedule_id, ci_builds.trigger_request_id, namespaces.shared_runners_minutes_limit, namespace_statistics.shared_runners_seconds, application_settings.shared_runners_minutes
-> Sort (cost=497654.14..497792.77 rows=55451 width=31)
Sort Key: ci_builds.runner_id, ci_runners.is_shared, projects.namespace_id, projects.mirror, projects.mirror_trigger_builds, ci_pipelines.pipeline_schedule_id, ci_builds.trigger_request_id, namespaces.shared_runners_minutes_limit, namespace_statistics.shared_runners_seconds, application_settings.shared_runners_minutes
-> Merge Join (cost=35.86..493284.90 rows=55451 width=31)
Merge Cond: (ci_builds.runner_id = ci_runners.id)
-> Nested Loop Left Join (cost=2.55..673730.34 rows=81970 width=30)
-> Nested Loop (cost=2.13..636717.25 rows=81970 width=30)
-> Nested Loop (cost=1.70..597579.94 rows=81970 width=22)
-> Nested Loop (cost=1.13..320367.65 rows=81970 width=22)
-> Nested Loop (cost=0.70..117107.14 rows=81973 width=20)
-> Index Scan using index_ci_builds_on_status_and_type_and_runner_id on ci_builds (cost=0.57..116079.33 rows=81973 width=16)
Index Cond: (((status)::text = 'running'::text) AND ((type)::text = 'Ci::Build'::text))
-> Materialize (cost=0.12..3.15 rows=1 width=4)
-> Index Scan using application_settings_pkey on application_settings (cost=0.12..3.14 rows=1 width=4)
Index Cond: (id = 1)
-> Index Scan using projects_pkey on projects (cost=0.43..2.47 rows=1 width=10)
Index Cond: (id = ci_builds.project_id)
Filter: (NOT pending_delete)
-> Index Scan using ci_pipelines_pkey on ci_pipelines (cost=0.57..3.37 rows=1 width=8)
Index Cond: (id = ci_builds.commit_id)
-> Index Scan using namespaces_pkey on namespaces (cost=0.43..0.47 rows=1 width=8)
Index Cond: (id = projects.namespace_id)
-> Index Scan using index_namespace_statistics_on_namespace_id on namespace_statistics (cost=0.42..0.44 rows=1 width=8)
Index Cond: (namespace_id = namespaces.id)
-> Index Scan using ci_runners_pkey on ci_runners (cost=0.42..35968.71 rows=343066 width=5)
(26 rows)
Edited by Andreas Brandl