Skip to content

Use a CTE to speed up the Group#sbom_occurrences query

What does this MR do and why?

This change adds a slight optimization to the Group#sbom_occurrences query.

Before:

SELECT sbom_occurrences.*, agg_occurrences.occurrence_count, agg_occurrences.project_count
FROM sbom_occurrences
INNER JOIN (
  SELECT component_id,
        COUNT(DISTINCT id) AS occurrence_count,
        COUNT(DISTINCT project_id) AS project_count
  FROM sbom_occurrences
  WHERE project_id IN (
    SELECT "projects"."id" FROM "projects"
    WHERE "projects"."namespace_id" IN (
      SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
      FROM "namespaces"
      WHERE "namespaces"."type" = 'Group'
      AND (traversal_ids @> ('{6543}'))
    )
  )
  GROUP BY component_id
) agg_occurrences ON sbom_occurrences.component_id = agg_occurrences.component_id
WHERE "sbom_occurrences"."project_id" IN (
  SELECT "projects"."id"
  FROM "projects"
  WHERE "projects"."namespace_id" IN (
    SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
    FROM "namespaces"
    WHERE "namespaces"."type" = 'Group' AND (traversal_ids @> ('{6543}'))
  )
) 
LIMIT 20 OFFSET 0;
Time: 9.927 s
  - planning: 4.170 ms
  - execution: 9.923 s
    - I/O read: 6.137 s
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 5882833 (~44.90 GiB) from the buffer pool
  - reads: 105769 (~826.30 MiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/22171/commands/71698

After:

WITH "our_occurrences" AS MATERIALIZED (
  SELECT "sbom_occurrences".*
  FROM "sbom_occurrences"
  WHERE "sbom_occurrences"."project_id" IN (
    SELECT "projects"."id"
    FROM "projects"
    WHERE "projects"."namespace_id" IN (
      SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
      FROM "namespaces"
      WHERE "namespaces"."type" = 'Group'
      AND (traversal_ids @> ('{6543}'))
    )
    AND "projects"."marked_for_deletion_at" IS NULL
    AND "projects"."pending_delete" = FALSE
  )
)
SELECT sbom_occurrences.*, agg_occurrences.occurrence_count, agg_occurrences.project_count
FROM "sbom_occurrences"
INNER JOIN (
  SELECT component_id,
        COUNT(DISTINCT id) AS occurrence_count,
        COUNT(DISTINCT project_id) AS project_count
  FROM our_occurrences
  GROUP BY component_id
) agg_occurrences ON sbom_occurrences.component_id = agg_occurrences.component_id
WHERE "sbom_occurrences"."id" IN (
  SELECT "our_occurrences"."id" FROM "our_occurrences"
)
LIMIT 20 OFFSET 0;
Time: 262.699 ms
  - planning: 2.672 ms
  - execution: 260.027 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 52188 (~407.70 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/22171/commands/71690

#422086 (closed)

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by mo khan

Merge request reports