Fix PipelineSecurityReportFindings query timeout

What does this MR do and why?

Fixes PipelineSecurityReportFindings query timeouts (#574214 (closed)) by switching from offset to keyset pagination.

Offset pagination requires scanning all previous rows (e.g., page 100 with 10 per page scans 1001 rows), causing timeouts on deep pagination.

Keyset pagination uses the last row's values as a cursor, always scanning only the requested page size regardless of depth. This will have the side-effect of breaking any existing queries using the old cursor values. However, the cursor is an opaque value and naturally should only be used inside the session it is provided so this should be an acceptable change.

SQL Queries

Example GraphQL query used (Page 2 with 10 per page):

query {
  project(fullPath: "gitlab-org/gitlab") {
    pipeline(id: "gid://gitlab/Ci::Pipeline/2190280877") {
      securityReportFindings(first: 10, sort: severity_desc, after: <end-cursor> ) {
        edges {
          node {
            dismissalReason
            dismissedAt
            falsePositive
            reportType
            severity
            state
            title
            uuid
          }
        }
        pageInfo {
          endCursor
          hasNextPage
        }
      }
    }
  }
}
GraphQL (FindingsFinder) Before
SELECT 
 "security_findings".* 
FROM 
  "security_scans", 
  unnest('{1,2,4,5,6,7}' :: smallint[]) AS "severities" ("severity"), 
  LATERAL (
    SELECT 
      "security_findings".* 
    FROM 
      "security_findings" 
      LEFT OUTER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."uuid" = "security_findings"."uuid" 
    WHERE 
      (
        "security_findings"."scan_id" = "security_scans"."id"
      ) 
      AND (
        COALESCE(
          "vulnerability_occurrences"."severity", 
          "security_findings"."severity"
        ) = "severities"."severity"
      ) 
      AND "security_findings"."partition_number" = 2 
      AND "security_findings"."deduplicated" = TRUE 
      AND (
        NOT EXISTS (
          SELECT 
            1 
          FROM 
            "vulnerabilities" 
            INNER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."vulnerability_id" = "vulnerabilities"."id" 
          WHERE 
            "vulnerabilities"."state" = 2 
            AND (
              vulnerability_occurrences.uuid = security_findings.uuid
            )
        )
      ) 
    ORDER BY 
      "security_findings"."severity" DESC, 
      "security_findings"."id" ASC 
    LIMIT 
      21 /* NOTE: This is the limit that increases with pagination ((pagesize * num_pages) + 1)*/
  ) AS "security_findings" 
WHERE 
  "security_scans"."pipeline_id" = 645 
  AND "security_scans"."latest" = TRUE 
  AND "security_scans"."status" = 1 
ORDER BY 
  "security_findings"."severity" DESC, 
  "security_findings"."id" ASC 

Query Plan: https://console.postgres.ai/shared/72eb2af8-de6f-4f3a-9e51-3764386fa6e6

GraphQL (FindingsFinder) After
SELECT 
  "security_findings".* 
FROM 
  "security_scans", 
  unnest('{1,2,4,5,6,7}' :: smallint[]) AS "severities" ("severity"), 
  LATERAL (
    SELECT 
      "security_findings".* 
    FROM 
      "security_findings" 
      LEFT OUTER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."uuid" = "security_findings"."uuid" 
    WHERE 
      (
        "security_findings"."scan_id" = "security_scans"."id"
      ) 
      AND (
        COALESCE(
          "vulnerability_occurrences"."severity", 
          "security_findings"."severity"
        ) = "severities"."severity"
      ) 
      AND "security_findings"."partition_number" = 2 
      AND "security_findings"."deduplicated" = TRUE 
      AND (
        NOT EXISTS (
          SELECT 
            1 
          FROM 
            "vulnerabilities" 
            INNER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."vulnerability_id" = "vulnerabilities"."id" 
          WHERE 
            "vulnerabilities"."state" = 2 
            AND (
              vulnerability_occurrences.uuid = security_findings.uuid
            )
        )
      ) 
    ORDER BY 
      "security_findings"."severity" ASC, 
      "security_findings"."id" ASC 
    LIMIT 
      11 /* Limit applied by keyset_paginate */
  ) AS "security_findings" 
WHERE 
  "security_scans"."pipeline_id" = 645 
  AND "security_scans"."latest" = TRUE 
  AND "security_scans"."status" = 1 
  AND (
    (
      "security_findings"."severity", 
      "security_findings"."id"
    ) > (7, 6607)
  ) 
ORDER BY 
  "security_findings"."severity" ASC, 
  "security_findings"."id" ASC 
LIMIT 
  11 /* Limit applied by the the GraphQL framework */

Query Plan: https://console.postgres.ai/shared/a7a64c25-129f-49f0-b8d7-afd89e5d2560

*Note: In the After query an outer limit of N+1 is applied by GraphQL automatically

References

How to set up and validate locally

To test the pagination:

  1. Run a security scanner pipeline on any project with vulnerabilities and take note of the pipeline ID

  2. Execute the following GraphQL query:

    query {
      project(fullPath: "<project-path>") {
        pipeline(id: "gid://gitlab/Ci::Pipeline/<pipeline-id>") {
          securityReportFindings(first: 10) {
            edges {
              node {
                dismissalReason
                dismissedAt
                falsePositive
                reportType
                severity
                state
                title
                uuid
              }
            }
            pageInfo {
              endCursor
              hasNextPage
            }
          }
        }
      }
    }
  3. Retrieve the second page of results using the endCursor from the first page:

    query {
      project(fullPath: "<project-path>") {
        pipeline(id: "gid://gitlab/Ci::Pipeline/<pipeline-id>") {
          securityReportFindings(first: 10, after: "<end-cursor-value>" ) {
            edges {
              node {
                dismissalReason
                dismissedAt
                falsePositive
                reportType
                severity
                state
                title
                uuid
              }
            }
            pageInfo {
              endCursor
              hasNextPage
            }
          }
        }
      }
    }
  4. Repeat Steps 1-3 using old and new pagination and compare results (they should be equivalent)

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #574214 (closed)

Edited by Harrison Peters

Merge request reports

Loading