Skip to content

Ensure packager ordering is consistent

Brian Williams requested to merge bwill/sort-packagers-in-consistent-order into master

What does this MR do and why?

It's common for there to be only 1-3 packagers in the dependency list. This means that when sorting by packager, the ordering is essentially random because all of the rows have the same value for it. This change adds composite sorting on (packager, name) to ensure that the ordering remains consistent.

SQL Queries

It is known that this query performs poorly for large groups. We are attempting to optimize / rewrite it via #427783 (closed). I don't expect that this change should make it perform any worse than it already does.

Before: https://console.postgres.ai/shared/284489e2-debf-4a97-ab98-fc7ae3104da9

WITH our_occurrences AS MATERIALIZED (
    SELECT
        DISTINCT ON ( sbom_occurrences.component_version_id )
        sbom_occurrences.*,
        count( sbom_occurrences.id ) OVER ( PARTITION BY sbom_occurrences.component_version_id ) AS occurrence_count,
        dense_rank() OVER (
            PARTITION BY sbom_occurrences.component_version_id
            ORDER BY project_id
        ) AS project_count
    FROM
        sbom_occurrences
    WHERE
        sbom_occurrences.project_id IN (
            SELECT
                projects.id
            FROM
                projects
            WHERE
                projects.namespace_id IN (
                    SELECT
                        namespaces.traversal_ids[array_length(
                            namespaces.traversal_ids,
                            1
                        )] AS id
                    FROM
                        namespaces
                    WHERE
                        namespaces.type = 'Group' AND
                        traversal_ids @> '{9970}'
                ) AND
                projects.marked_for_deletion_at IS NULL AND
                projects.pending_delete = false
        ) AND
        sbom_occurrences.component_version_id IS NOT NULL
    ORDER BY
        sbom_occurrences.component_version_id DESC,
        project_count DESC,
        sbom_occurrences.id DESC
)
SELECT
    id,
    created_at,
    updated_at,
    component_version_id,
    project_id,
    pipeline_id,
    source_id,
    commit_sha,
    component_id,
    uuid,
    package_manager,
    component_name,
    input_file_path,
    licenses,
    highest_severity,
    vulnerability_count,
    occurrence_count,
    project_count
FROM
    our_occurrences AS sbom_occurrences
ORDER BY
    package_manager ASC;

After: https://console.postgres.ai/shared/a9e9fbb7-3fe7-4c55-b379-92c2de32f67c

WITH our_occurrences AS MATERIALIZED (
    SELECT
        DISTINCT ON ( sbom_occurrences.component_version_id )
        sbom_occurrences.*,
        count( sbom_occurrences.id ) OVER ( PARTITION BY sbom_occurrences.component_version_id ) AS occurrence_count,
        dense_rank() OVER (
            PARTITION BY sbom_occurrences.component_version_id
            ORDER BY project_id
        ) AS project_count
    FROM
        sbom_occurrences
    WHERE
        sbom_occurrences.project_id IN (
            SELECT
                projects.id
            FROM
                projects
            WHERE
                projects.namespace_id IN (
                    SELECT
                        namespaces.traversal_ids[array_length(
                            namespaces.traversal_ids,
                            1
                        )] AS id
                    FROM
                        namespaces
                    WHERE
                        namespaces.type = 'Group' AND
                        traversal_ids @> '{9970}'
                ) AND
                projects.marked_for_deletion_at IS NULL AND
                projects.pending_delete = false
        ) AND
        sbom_occurrences.component_version_id IS NOT NULL
    ORDER BY
        sbom_occurrences.component_version_id DESC,
        project_count DESC,
        sbom_occurrences.id DESC
)
SELECT
    id,
    created_at,
    updated_at,
    component_version_id,
    project_id,
    pipeline_id,
    source_id,
    commit_sha,
    component_id,
    uuid,
    package_manager,
    component_name,
    input_file_path,
    licenses,
    highest_severity,
    vulnerability_count,
    occurrence_count,
    project_count
FROM
    our_occurrences AS sbom_occurrences
ORDER BY
    package_manager ASC,
    component_name ASC;

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

Before After
Screenshot_2023-12-11_at_4.15.17_PM Screenshot_2023-12-11_at_4.16.40_PM

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

group = Group.find_by_full_path('group_having_dependencies')
::Sbom::DependenciesFinder.new(group, params: { sort_by: 'packager', sort: 'asc' }).execute

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Brian Williams

Merge request reports