Skip to content

Sort dependencies by license

mo khan requested to merge mokhax/422086/sort-by-spdx-id into master

What does this MR do and why?

This change is gated by the group_level_licenses feature flag. This change adds an option to sort dependencies, on the group dependency list, by SPDX identifier. The sbom_occurrences.licenses column is a jsonb column with a schema that resembles the following:

[
  {
    "url": "https://spdx.org/licenses/Apache-2.0.html",
    "name": "Apache 2.0",
    "spdx_identifier": "Apache-2.0"
  },
  {
    "url": "https://spdx.org/licenses/Artistic-2.0.html",
    "name": "Artistic-2.0",
    "spdx_identifier": "Artistic-2.0"
  },
  {
    "url": "https://spdx.org/licenses/BSD-3-Clause.html",
    "name": "New BSD",
    "spdx_identifier": "BSD-3-Clause"
  },
  {
    "url": "https://spdx.org/licenses/GPL-3.0.html",
    "name": "GPLv3",
    "spdx_identifier": "GPL-3.0"
  }
]

The change in this MR attempts to add support for sorting Sbom::Occurence records by licenses. The # of elements in the sbom_occurrences.licenses column can be between 0-N with no upper bound at this time.

The change in this MR attempts to sort records by the first detected license then by the second detected license.

Related to:

Sort by licenses ascending:

SELECT
  sbom_occurrences.*,
  agg_occurrences.occurrence_count,
  agg_occurrences.project_count
FROM
  sbom_occurrences
  INNER JOIN (
    SELECT
      component_id,
      COUNT(DISTINCT id) AS occurrence_count,
      COUNT(DISTINCT project_id) AS project_count
    FROM
      sbom_occurrences
    WHERE
      project_id IN (
        SELECT
          "projects"."id"
        FROM
          "projects"
        WHERE
          "projects"."namespace_id" IN (
            SELECT
              namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
            FROM
              "namespaces"
            WHERE
              "namespaces"."type" = 'Group'
              AND (traversal_ids @> ('{9970}'))))
        GROUP BY
          component_id) agg_occurrences ON sbom_occurrences.component_id = agg_occurrences.component_id
WHERE
  "sbom_occurrences"."project_id" IN (
    SELECT
      "projects"."id"
    FROM
      "projects"
    WHERE
      "projects"."namespace_id" IN (
        SELECT
          namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
        FROM
          "namespaces"
        WHERE
          "namespaces"."type" = 'Group'
          AND (traversal_ids @> ('{9970}'))))
  ORDER BY
    licenses #> '{0,spdx_identifier}' ASC,
    licenses #> '{1,spdx_identifier}' ASC
  LIMIT 20 OFFSET 0

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/21933/commands/71064

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by mo khan

Merge request reports