Optimize the database query for pipelines for MRs
What does this MR do?
This MR improves the performance of the query for fetching pipelines for merge requests. Specifically,
-
Pipelines for merge requests doesn't need to be filtered by SHAs. There is an explicit relationship with a merge request through
merge_request_id
FK, so we can fetch the list of MRs by the id. - We fetch both pipelines for merged results and pipelines for merge requests in one query. As it's
ORDER BY pipelines.id
anyway, it doesn't make any difference. - Removing redundant
source
filtering.
This feature change is behind ci_pipelines_for_merge_request_finder_new_cte
feature flag and disabled by default. We can easily fallback to the previous behavior when there is an edge case on the new query.
Before
WITH "shas" AS (
SELECT
"merge_request_diff_commits"."sha"
FROM
"merge_request_diff_commits"
WHERE
"merge_request_diff_commits"."merge_request_diff_id" IN (
SELECT
"merge_request_diffs"."id"
FROM
"merge_request_diffs"
WHERE
"merge_request_diffs"."merge_request_id" = 46282578
ORDER BY
"merge_request_diffs"."id" DESC
LIMIT
100
)
LIMIT
10000
)
SELECT
"ci_pipelines".*
FROM
(
(
SELECT
"ci_pipelines".*
FROM
"ci_pipelines"
INNER JOIN "shas" ON "shas"."sha" = "ci_pipelines"."source_sha"
WHERE
(
"ci_pipelines"."source" IN (1, 2, 3, 4, 5, 6, 7, 8, 10, 11)
OR "ci_pipelines"."source" IS NULL
)
AND "ci_pipelines"."source" = 10
AND "ci_pipelines"."merge_request_id" = 46282578
AND "ci_pipelines"."project_id" IN (14278368, 278964)
)
UNION
(
SELECT
"ci_pipelines".*
FROM
"ci_pipelines"
INNER JOIN "shas" ON encode("shas"."sha", 'hex') = "ci_pipelines"."sha"
WHERE
(
"ci_pipelines"."source" IN (1, 2, 3, 4, 5, 6, 7, 8, 10, 11)
OR "ci_pipelines"."source" IS NULL
)
AND "ci_pipelines"."source" = 10
AND "ci_pipelines"."merge_request_id" = 46282578
AND "ci_pipelines"."project_id" IN (14278368, 278964)
)
UNION
(
SELECT
"ci_pipelines".*
FROM
"ci_pipelines"
INNER JOIN "shas" ON encode("shas"."sha", 'hex') = "ci_pipelines"."sha"
WHERE
"ci_pipelines"."project_id" = 14278368
AND (
"ci_pipelines"."source" IN (1, 2, 3, 4, 5, 6, 7, 8, 10, 11)
OR "ci_pipelines"."source" IS NULL
)
AND (
"ci_pipelines"."source" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13)
OR "ci_pipelines"."source" IS NULL
)
AND "ci_pipelines"."ref" = '25486-batch-suggestions'
AND "ci_pipelines"."tag" = FALSE
)
) ci_pipelines
ORDER BY
CASE ci_pipelines.source WHEN (10) THEN 0 ELSE 1 END,
ci_pipelines.id DESC
Performance
Time: 13.019 ms
- planning: 4.729 ms
- execution: 8.290 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 889 (~6.90 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Sort (cost=8024.05..8024.06 rows=3 width=1788) (actual time=7.767..7.788 rows=91 loops=1)
Sort Key: (CASE ci_pipelines.source WHEN 10 THEN 0 ELSE 1 END), ci_pipelines.id DESC
Sort Method: quicksort Memory: 49kB
Buffers: shared hit=889
CTE shas
-> Limit (cost=4.51..7270.79 rows=10000 width=21) (actual time=0.230..3.912 rows=1435 loops=1)
Buffers: shared hit=605
-> Nested Loop (cost=4.51..143905.30 rows=198039 width=21) (actual time=0.229..3.672 rows=1435 loops=1)
Buffers: shared hit=605
-> HashAggregate (cost=3.93..4.05 rows=12 width=4) (actual time=0.202..0.257 rows=92 loops=1)
Group Key: merge_request_diffs.id
Buffers: shared hit=91
-> Limit (cost=0.57..3.78 rows=12 width=4) (actual time=0.041..0.159 rows=92 loops=1)
Buffers: shared hit=91
-> Index Only Scan using index_merge_request_diffs_on_merge_request_id_and_id on public.merge_request_diffs (cost=0.57..3.78 rows=12 width=4) (actual time=0.041..0.145 rows=92 loops=1)
Index Cond: (merge_request_diffs.merge_request_id = 46282578)
Heap Fetches: 5
Buffers: shared hit=91
-> Index Scan using merge_request_diff_commits_pkey on public.merge_request_diff_commits (cost=0.58..11826.74 rows=16503 width=25) (actual time=0.022..0.032 rows=16 loops=92)
Index Cond: (merge_request_diff_commits.merge_request_diff_id = merge_request_diffs.id)
Buffers: shared hit=514
-> Subquery Scan on ci_pipelines (cost=753.16..753.23 rows=3 width=1788) (actual time=7.396..7.653 rows=91 loops=1)
Buffers: shared hit=889
-> HashAggregate (cost=753.16..753.19 rows=3 width=1784) (actual time=7.390..7.603 rows=91 loops=1)
Group Key: ci_pipelines_1.id, ci_pipelines_1.ref, ci_pipelines_1.sha, ci_pipelines_1.before_sha, ci_pipelines_1.created_at, ci_pipelines_1.updated_at, ci_pipelines_1.tag, ci_pipelines_1.yaml_errors, ci_pipelines_1.committed_at, ci_pipelines_1.project_id, ci_pipelines_1.status, ci_pipelines_1.started_at, ci_pipelines_1.finished_at, ci_pipelines_1.duration, ci_pipelines_1.user_id, ci_pipelines_1.lock_version, ci_pipelines_1.auto_canceled_by_id, ci_pipelines_1.pipeline_schedule_id, ci_pipelines_1.source, ci_pipelines_1.protected, ci_pipelines_1.config_source, ci_pipelines_1.failure_reason, ci_pipelines_1.iid, ci_pipelines_1.merge_request_id, ci_pipelines_1.source_sha, ci_pipelines_1.target_sha, ci_pipelines_1.external_pull_request_id, ci_pipelines_1.ci_ref_id, ci_pipelines_1.locked
Buffers: shared hit=889
-> Append (cost=5.87..752.95 rows=3 width=1784) (actual time=0.824..6.705 rows=154 loops=1)
Buffers: shared hit=889
-> Hash Join (cost=5.87..243.38 rows=1 width=318) (actual time=0.542..0.545 rows=0 loops=1)
Hash Cond: (shas.sha = ci_pipelines_1.source_sha)
Buffers: shared hit=190
-> CTE Scan on shas (cost=0.00..200.00 rows=10000 width=32) (actual time=0.231..0.232 rows=1 loops=1)
Buffers: shared hit=96
-> Hash (cost=5.86..5.86 rows=1 width=318) (actual time=0.291..0.293 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
Buffers: shared hit=94
-> Index Scan using index_ci_pipelines_on_merge_request_id on public.ci_pipelines ci_pipelines_1 (cost=0.43..5.86 rows=1 width=318) (actual time=0.025..0.282 rows=91 loops=1)
Index Cond: (ci_pipelines_1.merge_request_id = 46282578)
Filter: ((ci_pipelines_1.project_id = ANY ('{14278368,278964}'::integer[])) AND (ci_pipelines_1.source = 10) AND ((ci_pipelines_1.source = ANY ('{1,2,3,4,5,6,7,8,10,11}'::integer[])) OR (ci_pipelines_1.source IS NULL)))
Rows Removed by Filter: 0
Buffers: shared hit=94
-> Hash Join (cost=5.87..255.88 rows=1 width=318) (actual time=0.280..5.737 rows=154 loops=1)
Hash Cond: (encode(shas_1.sha, 'hex'::text) = (ci_pipelines_2.sha)::text)
Buffers: shared hit=603
-> CTE Scan on shas shas_1 (cost=0.00..200.00 rows=10000 width=32) (actual time=0.001..4.442 rows=1435 loops=1)
Buffers: shared hit=509
-> Hash (cost=5.86..5.86 rows=1 width=318) (actual time=0.242..0.243 rows=91 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 29kB
Buffers: shared hit=94
-> Index Scan using index_ci_pipelines_on_merge_request_id on public.ci_pipelines ci_pipelines_2 (cost=0.43..5.86 rows=1 width=318) (actual time=0.019..0.158 rows=91 loops=1)
Index Cond: (ci_pipelines_2.merge_request_id = 46282578)
Filter: ((ci_pipelines_2.project_id = ANY ('{14278368,278964}'::integer[])) AND (ci_pipelines_2.source = 10) AND ((ci_pipelines_2.source = ANY ('{1,2,3,4,5,6,7,8,10,11}'::integer[])) OR (ci_pipelines_2.source IS NULL)))
Rows Removed by Filter: 0
Buffers: shared hit=94
-> Hash Join (cost=3.63..253.64 rows=1 width=318) (actual time=0.369..0.370 rows=0 loops=1)
Hash Cond: (encode(shas_2.sha, 'hex'::text) = (ci_pipelines_3.sha)::text)
Buffers: shared hit=96
-> CTE Scan on shas shas_2 (cost=0.00..200.00 rows=10000 width=32) (actual time=0.001..0.001 rows=1 loops=1)
-> Hash (cost=3.62..3.62 rows=1 width=318) (actual time=0.314..0.315 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
Buffers: shared hit=96
-> Index Scan using index_ci_pipelines_on_project_idandrefandiddesc on public.ci_pipelines ci_pipelines_3 (cost=0.57..3.62 rows=1 width=318) (actual time=0.313..0.313 rows=0 loops=1)
Index Cond: ((ci_pipelines_3.project_id = 14278368) AND ((ci_pipelines_3.ref)::text = '25486-batch-suggestions'::text))
Filter: ((NOT ci_pipelines_3.tag) AND ((ci_pipelines_3.source = ANY ('{1,2,3,4,5,6,7,8,10,11}'::integer[])) OR (ci_pipelines_3.source IS NULL)) AND ((ci_pipelines_3.source = ANY ('{1,2,3,4,5,6,7,8,9,11,12,13}'::integer[])) OR (ci_pipelines_3.source IS NULL)))
Rows Removed by Filter: 91
Buffers: shared hit=96
After
Query
WITH "shas" AS (
SELECT
"merge_request_diff_commits"."sha"
FROM
"merge_request_diff_commits"
WHERE
"merge_request_diff_commits"."merge_request_diff_id" IN (
SELECT
"merge_request_diffs"."id"
FROM
"merge_request_diffs"
WHERE
"merge_request_diffs"."merge_request_id" = 46282578
ORDER BY
"merge_request_diffs"."id" DESC
LIMIT
100
)
LIMIT
10000
)
SELECT
"ci_pipelines".*
FROM
(
(
SELECT
"ci_pipelines".*
FROM
"ci_pipelines"
WHERE
"ci_pipelines"."source" = 10
AND "ci_pipelines"."merge_request_id" = 46282578
AND "ci_pipelines"."project_id" IN (14278368, 278964)
)
UNION
(
SELECT
"ci_pipelines".*
FROM
"ci_pipelines"
INNER JOIN "shas" ON encode("shas"."sha", 'hex') = "ci_pipelines"."sha"
WHERE
"ci_pipelines"."project_id" = 14278368
AND (
"ci_pipelines"."source" IN (1, 2, 3, 4, 5, 6, 7, 8, 11)
OR "ci_pipelines"."source" IS NULL
)
AND "ci_pipelines"."ref" = '25486-batch-suggestions'
AND "ci_pipelines"."tag" = FALSE
)
) ci_pipelines
ORDER BY
CASE ci_pipelines.source WHEN (10) THEN 0 ELSE 1 END,
ci_pipelines.id DESC
Performance
Time: 5.589 ms
- planning: 2.491 ms
- execution: 3.098 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 309 (~2.40 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Sort (cost=7530.31..7530.32 rows=2 width=1788) (actual time=2.674..2.692 rows=91 loops=1)
Sort Key: (CASE ci_pipelines.source WHEN 10 THEN 0 ELSE 1 END), ci_pipelines.id DESC
Sort Method: quicksort Memory: 49kB
Buffers: shared hit=309
CTE shas
-> Limit (cost=4.51..7270.79 rows=10000 width=21) (actual time=0.836..0.838 rows=1 loops=1)
Buffers: shared hit=96
-> Nested Loop (cost=4.51..143905.30 rows=198039 width=21) (actual time=0.834..0.837 rows=1 loops=1)
Buffers: shared hit=96
-> HashAggregate (cost=3.93..4.05 rows=12 width=4) (actual time=0.784..0.786 rows=1 loops=1)
Group Key: merge_request_diffs.id
Buffers: shared hit=91
-> Limit (cost=0.57..3.78 rows=12 width=4) (actual time=0.116..0.715 rows=92 loops=1)
Buffers: shared hit=91
-> Index Only Scan using index_merge_request_diffs_on_merge_request_id_and_id on public.merge_request_diffs (cost=0.57..3.78 rows=12 width=4) (actual time=0.115..0.695 rows=92 loops=1)
Index Cond: (merge_request_diffs.merge_request_id = 46282578)
Heap Fetches: 5
Buffers: shared hit=91
-> Index Scan using merge_request_diff_commits_pkey on public.merge_request_diff_commits (cost=0.58..11826.74 rows=16503 width=25) (actual time=0.044..0.045 rows=1 loops=1)
Index Cond: (merge_request_diff_commits.merge_request_diff_id = merge_request_diffs.id)
Buffers: shared hit=5
-> Subquery Scan on ci_pipelines (cost=259.33..259.51 rows=2 width=1788) (actual time=2.319..2.470 rows=91 loops=1)
Buffers: shared hit=306
-> Unique (cost=259.33..259.48 rows=2 width=1784) (actual time=2.314..2.402 rows=91 loops=1)
Buffers: shared hit=306
-> Sort (cost=259.33..259.34 rows=2 width=1784) (actual time=2.312..2.325 rows=91 loops=1)
Sort Key: ci_pipelines_1.id, ci_pipelines_1.ref, ci_pipelines_1.sha, ci_pipelines_1.before_sha, ci_pipelines_1.created_at, ci_pipelines_1.updated_at, ci_pipelines_1.tag, ci_pipelines_1.yaml_errors, ci_pipelines_1.committed_at, ci_pipelines_1.project_id, ci_pipelines_1.status, ci_pipelines_1.started_at, ci_pipelines_1.finished_at, ci_pipelines_1.duration, ci_pipelines_1.user_id, ci_pipelines_1.lock_version, ci_pipelines_1.auto_canceled_by_id, ci_pipelines_1.pipeline_schedule_id, ci_pipelines_1.source, ci_pipelines_1.protected, ci_pipelines_1.config_source, ci_pipelines_1.failure_reason, ci_pipelines_1.iid, ci_pipelines_1.merge_request_id, ci_pipelines_1.source_sha, ci_pipelines_1.target_sha, ci_pipelines_1.external_pull_request_id, ci_pipelines_1.ci_ref_id, ci_pipelines_1.locked
Sort Method: quicksort Memory: 49kB
Buffers: shared hit=306
-> Append (cost=0.43..259.32 rows=2 width=1784) (actual time=0.026..2.067 rows=91 loops=1)
Buffers: shared hit=286
-> Index Scan using index_ci_pipelines_on_merge_request_id on public.ci_pipelines ci_pipelines_1 (cost=0.43..5.67 rows=1 width=318) (actual time=0.025..0.933 rows=91 loops=1)
Index Cond: (ci_pipelines_1.merge_request_id = 46282578)
Filter: ((ci_pipelines_1.project_id = ANY ('{14278368,278964}'::integer[])) AND (ci_pipelines_1.source = 10))
Rows Removed by Filter: 0
Buffers: shared hit=94
-> Hash Join (cost=3.61..253.62 rows=1 width=318) (actual time=1.109..1.110 rows=0 loops=1)
Hash Cond: (encode(shas.sha, 'hex'::text) = (ci_pipelines_2.sha)::text)
Buffers: shared hit=192
-> CTE Scan on shas (cost=0.00..200.00 rows=10000 width=32) (actual time=0.839..0.839 rows=1 loops=1)
Buffers: shared hit=96
-> Hash (cost=3.60..3.60 rows=1 width=318) (actual time=0.242..0.243 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
Buffers: shared hit=96
-> Index Scan using index_ci_pipelines_on_project_idandrefandiddesc on public.ci_pipelines ci_pipelines_2 (cost=0.57..3.60 rows=1 width=318) (actual time=0.242..0.242 rows=0 loops=1)
Index Cond: ((ci_pipelines_2.project_id = 14278368) AND ((ci_pipelines_2.ref)::text = '25486-batch-suggestions'::text))
Filter: ((NOT ci_pipelines_2.tag) AND ((ci_pipelines_2.source = ANY ('{1,2,3,4,5,6,7,8,11}'::integer[])) OR (ci_pipelines_2.source IS NULL)))
Rows Removed by Filter: 91
Buffers: shared hit=96
Screenshots (strongly suggested)
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team
Edited by Mayra Cabrera