Fix PipelineSecurityReportFindings query timeout
What does this MR do and why?
Fixes PipelineSecurityReportFindings query timeouts (#574214 (closed)) by switching from offset to keyset pagination.
Offset pagination requires scanning all previous rows (e.g., page 100 with 10 per page scans 1001 rows), causing timeouts on deep pagination.
Keyset pagination uses the last row's values as a cursor, always scanning only the requested page size regardless of depth. This will have the side-effect of breaking any existing queries using the old cursor values. However, the cursor is an opaque value and naturally should only be used inside the session it is provided so this should be an acceptable change.
SQL Queries
Example GraphQL query used (Page 2 with 10 per page):
query {
project(fullPath: "gitlab-org/gitlab") {
pipeline(id: "gid://gitlab/Ci::Pipeline/2190280877") {
securityReportFindings(first: 10, sort: severity_desc, after: <end-cursor> ) {
edges {
node {
dismissalReason
dismissedAt
falsePositive
reportType
severity
state
title
uuid
}
}
pageInfo {
endCursor
hasNextPage
}
}
}
}
}
GraphQL (FindingsFinder) Before
SELECT
"security_findings".*
FROM
"security_scans",
unnest('{1,2,4,5,6,7}' :: smallint[]) AS "severities" ("severity"),
LATERAL (
SELECT
"security_findings".*
FROM
"security_findings"
LEFT OUTER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."uuid" = "security_findings"."uuid"
WHERE
(
"security_findings"."scan_id" = "security_scans"."id"
)
AND (
COALESCE(
"vulnerability_occurrences"."severity",
"security_findings"."severity"
) = "severities"."severity"
)
AND "security_findings"."partition_number" = 2
AND "security_findings"."deduplicated" = TRUE
AND (
NOT EXISTS (
SELECT
1
FROM
"vulnerabilities"
INNER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."vulnerability_id" = "vulnerabilities"."id"
WHERE
"vulnerabilities"."state" = 2
AND (
vulnerability_occurrences.uuid = security_findings.uuid
)
)
)
ORDER BY
"security_findings"."severity" DESC,
"security_findings"."id" ASC
LIMIT
21 /* NOTE: This is the limit that increases with pagination ((pagesize * num_pages) + 1)*/
) AS "security_findings"
WHERE
"security_scans"."pipeline_id" = 645
AND "security_scans"."latest" = TRUE
AND "security_scans"."status" = 1
ORDER BY
"security_findings"."severity" DESC,
"security_findings"."id" ASC
Query Plan: https://console.postgres.ai/shared/72eb2af8-de6f-4f3a-9e51-3764386fa6e6
GraphQL (FindingsFinder) After
SELECT
"security_findings".*
FROM
"security_scans",
unnest('{1,2,4,5,6,7}' :: smallint[]) AS "severities" ("severity"),
LATERAL (
SELECT
"security_findings".*
FROM
"security_findings"
LEFT OUTER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."uuid" = "security_findings"."uuid"
WHERE
(
"security_findings"."scan_id" = "security_scans"."id"
)
AND (
COALESCE(
"vulnerability_occurrences"."severity",
"security_findings"."severity"
) = "severities"."severity"
)
AND "security_findings"."partition_number" = 2
AND "security_findings"."deduplicated" = TRUE
AND (
NOT EXISTS (
SELECT
1
FROM
"vulnerabilities"
INNER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."vulnerability_id" = "vulnerabilities"."id"
WHERE
"vulnerabilities"."state" = 2
AND (
vulnerability_occurrences.uuid = security_findings.uuid
)
)
)
ORDER BY
"security_findings"."severity" ASC,
"security_findings"."id" ASC
LIMIT
11 /* Limit applied by keyset_paginate */
) AS "security_findings"
WHERE
"security_scans"."pipeline_id" = 645
AND "security_scans"."latest" = TRUE
AND "security_scans"."status" = 1
AND (
(
"security_findings"."severity",
"security_findings"."id"
) > (7, 6607)
)
ORDER BY
"security_findings"."severity" ASC,
"security_findings"."id" ASC
LIMIT
11 /* Limit applied by the the GraphQL framework */
Query Plan: https://console.postgres.ai/shared/a7a64c25-129f-49f0-b8d7-afd89e5d2560
*Note: In the After query an outer limit of N+1 is applied by GraphQL automatically
References
How to set up and validate locally
To test the pagination:
-
Run a security scanner pipeline on any project with vulnerabilities and take note of the pipeline ID
-
Execute the following GraphQL query:
query { project(fullPath: "<project-path>") { pipeline(id: "gid://gitlab/Ci::Pipeline/<pipeline-id>") { securityReportFindings(first: 10) { edges { node { dismissalReason dismissedAt falsePositive reportType severity state title uuid } } pageInfo { endCursor hasNextPage } } } } } -
Retrieve the second page of results using the endCursor from the first page:
query { project(fullPath: "<project-path>") { pipeline(id: "gid://gitlab/Ci::Pipeline/<pipeline-id>") { securityReportFindings(first: 10, after: "<end-cursor-value>" ) { edges { node { dismissalReason dismissedAt falsePositive reportType severity state title uuid } } pageInfo { endCursor hasNextPage } } } } } -
Repeat Steps 1-3 using old and new pagination and compare results (they should be equivalent)
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #574214 (closed)