Skip to content

Fix dependency aggregation pagination when cursor contains nulls

Brian Williams requested to merge bwill/fix-keyset-pagination into master

What does this MR do and why?

The dependency aggregations currently do not correctly account for nullability of cursor columns. If the cursor contains a null value, then we will attempt to build a query that checks for sbom_occurrences.highest_severity > NULL which always returns 0 rows as NULL cannot be compared with > or <. This MR updates the pagination to correctly consider nullability. Additionally, keyset_paginate is removed from the outer query because the column conditions added by it are redundant and cause the query to waste time performing additional scans. Pagination headers are now manually added in the controller, because they cannot be automatically added by the serializer without keyset_paginate on the relation.

Query Plans

Before: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/29484/commands/91621 (broken, returns 0 rows)

Click to expand
WITH namespaces AS MATERIALIZED (
    SELECT
        namespaces.traversal_ids
    FROM
        namespaces
    WHERE
        namespaces.type = 'Group' AND
        traversal_ids @> '{9970}'
)
SELECT
    outer_occurrences.highest_severity,
    outer_occurrences.component_id,
    outer_occurrences.component_version_id,
    min( outer_occurrences.id )::pg_catalog.int8 AS id,
    min( outer_occurrences.package_manager ) AS package_manager,
    min( outer_occurrences.input_file_path ) AS input_file_path,
    jsonb_agg( outer_occurrences.licenses -> 0 ) AS licenses,
    sum( counts.occurrence_count )::pg_catalog.int4 AS occurrence_count,
    sum( counts.vulnerability_count )::pg_catalog.int4 AS vulnerability_count,
    sum( counts.project_count )::pg_catalog.int4 AS project_count
FROM
    (
        SELECT
            DISTINCT ON ( inner_occurrences.highest_severity, inner_occurrences.component_id, inner_occurrences.component_version_id )
            inner_occurrences.id,
            inner_occurrences.created_at,
            inner_occurrences.updated_at,
            inner_occurrences.component_version_id,
            inner_occurrences.project_id,
            inner_occurrences.pipeline_id,
            inner_occurrences.source_id,
            inner_occurrences.commit_sha,
            inner_occurrences.component_id,
            inner_occurrences.uuid,
            inner_occurrences.package_manager,
            inner_occurrences.component_name,
            inner_occurrences.input_file_path,
            inner_occurrences.licenses,
            inner_occurrences.highest_severity,
            inner_occurrences.vulnerability_count,
            inner_occurrences.source_package_id,
            inner_occurrences.archived,
            inner_occurrences.traversal_ids,
            inner_occurrences.ancestors
        FROM
            namespaces,
            LATERAL (
                SELECT
                    DISTINCT ON ( sbom_occurrences.highest_severity, sbom_occurrences.component_id, sbom_occurrences.component_version_id )
                    sbom_occurrences.id,
                    sbom_occurrences.created_at,
                    sbom_occurrences.updated_at,
                    sbom_occurrences.component_version_id,
                    sbom_occurrences.project_id,
                    sbom_occurrences.pipeline_id,
                    sbom_occurrences.source_id,
                    sbom_occurrences.commit_sha,
                    sbom_occurrences.component_id,
                    sbom_occurrences.uuid,
                    sbom_occurrences.package_manager,
                    sbom_occurrences.component_name,
                    sbom_occurrences.input_file_path,
                    sbom_occurrences.licenses,
                    sbom_occurrences.highest_severity,
                    sbom_occurrences.vulnerability_count,
                    sbom_occurrences.source_package_id,
                    sbom_occurrences.archived,
                    sbom_occurrences.traversal_ids,
                    sbom_occurrences.ancestors
                FROM
                    sbom_occurrences
                WHERE
                    sbom_occurrences.traversal_ids = namespaces.traversal_ids::pg_catalog.int8[] AND
                    sbom_occurrences.archived = false AND
                    (
                        sbom_occurrences.highest_severity,
                        sbom_occurrences.component_id,
                        sbom_occurrences.component_version_id
                    ) < ( NULL, 153, 1775 )
                ORDER BY
                    sbom_occurrences.highest_severity DESC,
                    sbom_occurrences.component_id DESC,
                    sbom_occurrences.component_version_id DESC
                LIMIT 21
            ) AS inner_occurrences
        ORDER BY
            inner_occurrences.highest_severity DESC,
            inner_occurrences.component_id DESC,
            inner_occurrences.component_version_id DESC
        LIMIT 21
    ) AS outer_occurrences,
    LATERAL (
        SELECT
            count( project_id ) AS occurrence_count,
            count( project_id ) AS project_count,
            sum( vulnerability_count ) AS vulnerability_count
        FROM
            sbom_occurrences
        WHERE
            traversal_ids >= '{9970}' AND
            traversal_ids < '{9971}' AND
            sbom_occurrences.archived = false AND
            sbom_occurrences.component_version_id = outer_occurrences.component_version_id
    ) AS counts
WHERE
    (
        outer_occurrences.highest_severity,
        outer_occurrences.component_id,
        outer_occurrences.component_version_id
    ) < ( NULL, 153, 1775 )
GROUP BY
    outer_occurrences.highest_severity,
    outer_occurrences.component_id,
    outer_occurrences.component_version_id
ORDER BY
    min( outer_occurrences.highest_severity ) DESC,
    min( outer_occurrences.component_id ) DESC,
    min( outer_occurrences.component_version_id ) DESC
LIMIT 21;

After: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/29484/commands/91620

Click to expand
WITH namespaces AS MATERIALIZED (
    SELECT
        namespaces.traversal_ids
    FROM
        namespaces
    WHERE
        namespaces.type = 'Group' AND
        traversal_ids @> '{9970}'
)
SELECT
    outer_occurrences.highest_severity,
    outer_occurrences.component_id,
    outer_occurrences.component_version_id,
    min( outer_occurrences.id )::pg_catalog.int8 AS id,
    min( outer_occurrences.package_manager ) AS package_manager,
    min( outer_occurrences.input_file_path ) AS input_file_path,
    jsonb_agg( outer_occurrences.licenses -> 0 ) AS licenses,
    sum( counts.occurrence_count )::pg_catalog.int4 AS occurrence_count,
    sum( counts.vulnerability_count )::pg_catalog.int4 AS vulnerability_count,
    sum( counts.project_count )::pg_catalog.int4 AS project_count
FROM
    (
        SELECT
            DISTINCT ON ( inner_occurrences.highest_severity, inner_occurrences.component_id, inner_occurrences.component_version_id )
            inner_occurrences.id,
            inner_occurrences.created_at,
            inner_occurrences.updated_at,
            inner_occurrences.component_version_id,
            inner_occurrences.project_id,
            inner_occurrences.pipeline_id,
            inner_occurrences.source_id,
            inner_occurrences.commit_sha,
            inner_occurrences.component_id,
            inner_occurrences.uuid,
            inner_occurrences.package_manager,
            inner_occurrences.component_name,
            inner_occurrences.input_file_path,
            inner_occurrences.licenses,
            inner_occurrences.highest_severity,
            inner_occurrences.vulnerability_count,
            inner_occurrences.source_package_id,
            inner_occurrences.archived,
            inner_occurrences.traversal_ids,
            inner_occurrences.ancestors
        FROM
            namespaces,
            LATERAL (
                SELECT
                    DISTINCT ON ( sbom_occurrences.highest_severity, sbom_occurrences.component_id, sbom_occurrences.component_version_id )
                    sbom_occurrences.id,
                    sbom_occurrences.created_at,
                    sbom_occurrences.updated_at,
                    sbom_occurrences.component_version_id,
                    sbom_occurrences.project_id,
                    sbom_occurrences.pipeline_id,
                    sbom_occurrences.source_id,
                    sbom_occurrences.commit_sha,
                    sbom_occurrences.component_id,
                    sbom_occurrences.uuid,
                    sbom_occurrences.package_manager,
                    sbom_occurrences.component_name,
                    sbom_occurrences.input_file_path,
                    sbom_occurrences.licenses,
                    sbom_occurrences.highest_severity,
                    sbom_occurrences.vulnerability_count,
                    sbom_occurrences.source_package_id,
                    sbom_occurrences.archived,
                    sbom_occurrences.traversal_ids,
                    sbom_occurrences.ancestors
                FROM
                    sbom_occurrences
                WHERE
                    sbom_occurrences.traversal_ids = namespaces.traversal_ids::pg_catalog.int8[] AND
                    sbom_occurrences.archived = false AND
                    (
                        (
                            NULL IS NULL AND
                            sbom_occurrences.highest_severity IS NULL AND
                            sbom_occurrences.component_id < 153
                        ) OR
                        (
                            NULL IS NULL AND
                            sbom_occurrences.highest_severity IS NULL AND
                            sbom_occurrences.component_id = 153 AND
                            1775 IS NULL AND
                            sbom_occurrences.component_version_id IS NOT NULL
                        ) OR
                        (
                            NULL IS NULL AND
                            sbom_occurrences.highest_severity IS NULL AND
                            sbom_occurrences.component_id = 153 AND
                            1775 IS NOT NULL AND
                            sbom_occurrences.component_version_id < 1775
                        ) OR
                        (
                            NULL IS NULL AND
                            sbom_occurrences.highest_severity IS NOT NULL
                        ) OR
                        (
                            NULL IS NOT NULL AND
                            sbom_occurrences.highest_severity < NULL
                        ) OR
                        (
                            NULL IS NOT NULL AND
                            sbom_occurrences.highest_severity IS NULL AND
                            sbom_occurrences.component_id < 153
                        ) OR
                        (
                            NULL IS NOT NULL AND
                            sbom_occurrences.highest_severity IS NULL AND
                            sbom_occurrences.component_id = 153 AND
                            1775 IS NULL AND
                            sbom_occurrences.component_version_id IS NOT NULL
                        ) OR
                        (
                            NULL IS NOT NULL AND
                            sbom_occurrences.highest_severity IS NULL AND
                            sbom_occurrences.component_id = 153 AND
                            1775 IS NOT NULL AND
                            sbom_occurrences.component_version_id < 1775
                        )
                    )
                ORDER BY
                    sbom_occurrences.highest_severity DESC,
                    sbom_occurrences.component_id DESC,
                    sbom_occurrences.component_version_id DESC
                LIMIT 21
            ) AS inner_occurrences
        ORDER BY
            inner_occurrences.highest_severity DESC,
            inner_occurrences.component_id DESC,
            inner_occurrences.component_version_id DESC
        LIMIT 21
    ) AS outer_occurrences,
    LATERAL (
        SELECT
            count( project_id ) AS occurrence_count,
            count( project_id ) AS project_count,
            sum( vulnerability_count ) AS vulnerability_count
        FROM
            sbom_occurrences
        WHERE
            traversal_ids >= '{9970}' AND
            traversal_ids < '{9971}' AND
            sbom_occurrences.archived = false AND
            sbom_occurrences.component_version_id = outer_occurrences.component_version_id
    ) AS counts
GROUP BY
    outer_occurrences.highest_severity,
    outer_occurrences.component_id,
    outer_occurrences.component_version_id
ORDER BY
    min( outer_occurrences.highest_severity ) DESC,
    min( outer_occurrences.component_id ) DESC,
    min( outer_occurrences.component_version_id ) DESC;

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

Screen_Recording_2024-06-26_at_5.31.53_PM

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

  1. Enable the rewrite_sbom_occurrences_query feature flag
  2. Visit a project with data in the dependency list
  3. Observe that next and previous page buttons work and use cursor-based keyset pagination
Edited by Brian Williams

Merge request reports