Skip to content

Use `jsonb_to_recordset` to fetch a unique list of licenses

mo khan requested to merge mokhax/422293/sbom-licenses-jsonb_to_recordset into master

What does this MR do and why?

This change alters the query to fetch all the detected licenses within a group to use the jsonb_to_recordset function. This change was inspired by a discussion that started in !132597 (comment 1591936459).

#422293 (closed)

Before:

SELECT DISTINCT "sbom_occurrences"."licenses"
FROM "sbom_occurrences"
WHERE "sbom_occurrences"."project_id" IN (
  SELECT "projects"."id"
  FROM "projects"
  WHERE "projects"."namespace_id" IN (
    SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
    FROM "namespaces"
    WHERE "namespaces"."type" = 'Group'
    AND (traversal_ids @> ('{9970}'))
  )
  AND "projects"."marked_for_deletion_at" IS NULL
  AND "projects"."pending_delete" = FALSE
)
ORDER BY "sbom_occurrences"."licenses" ASC
LIMIT 100
Time: 349.122 ms
  - planning: 13.196 ms
  - execution: 335.926 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 64458 (~503.60 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/23046/commands/74214

After:

SELECT DISTINCT "spdx_identifier", "name", "url"
FROM "sbom_occurrences"
LEFT JOIN LATERAL jsonb_to_recordset(sbom_occurrences.licenses) AS sbom_licenses(spdx_identifier TEXT, name TEXT, url TEXT) ON TRUE
WHERE "sbom_occurrences"."project_id" IN (
  SELECT "projects"."id"
  FROM "projects"
  WHERE "projects"."namespace_id" IN (
    SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
    FROM "namespaces"
    WHERE "namespaces"."type" = 'Group'
    AND (traversal_ids @> ('{9970}'))
  )
  AND "projects"."marked_for_deletion_at" IS NULL
  AND "projects"."pending_delete" = FALSE
)
AND (licenses != '[]')
AND "sbom_licenses"."spdx_identifier" IS NOT NULL
ORDER BY sbom_licenses.spdx_identifier ASC
LIMIT 100;
Time: 305.268 ms
  - planning: 13.737 ms
  - execution: 291.531 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 64456 (~503.60 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/23046/commands/74213

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by mo khan

Merge request reports