Add workers to sync `sbom_occurrences.traversal_ids`
What does this MR do and why?
Add workers to update sbom_occurrences.traversal_ids
with project.namespace.traversal_ids
when a project or namespace is transferred.
Subscriptions will be added in a separate MR to facilitate canary deployments.
Relates to: #437636 (closed)
SQL Queries
Get projects to update for group: https://console.postgres.ai/shared/ebf6795e-bb1d-4dca-80d5-979327475344
SELECT
projects.id
FROM
projects
WHERE
projects.namespace_id IN (
SELECT
namespaces.traversal_ids[array_length(
namespaces.traversal_ids,
1
)] AS id
FROM
namespaces
WHERE
namespaces.type = 'Group' AND
traversal_ids @> '{9970}'
) AND
EXISTS (
SELECT
1
FROM
sbom_occurrences
WHERE
sbom_occurrences.project_id = projects.id
);
Nested Loop Semi Join (cost=720.75..7930.33 rows=11 width=4) (actual time=1447.214..7299.702 rows=318 loops=1)
Buffers: shared hit=18432 read=4916
I/O Timings: read=7146.118 write=0.000
-> Nested Loop (cost=720.18..2074.13 rows=10096 width=4) (actual time=1431.755..3493.923 rows=3742 loops=1)
Buffers: shared hit=6000 read=2044
I/O Timings: read=3426.796 write=0.000
-> HashAggregate (cost=719.62..723.71 rows=409 width=28) (actual time=1426.498..1428.425 rows=821 loops=1)
Group Key: namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)]
Buffers: shared hit=1 read=804
I/O Timings: read=1407.347 write=0.000
-> Bitmap Heap Scan on public.namespaces (cost=79.90..718.59 rows=409 width=28) (actual time=35.991..1423.489 rows=821 loops=1)
Buffers: shared hit=1 read=804
I/O Timings: read=1407.347 write=0.000
-> Bitmap Index Scan using index_namespaces_on_traversal_ids_for_groups (cost=0.00..79.80 rows=409 width=0) (actual time=28.940..28.941 rows=822 loops=1)
Index Cond: (namespaces.traversal_ids @> '{9970}'::integer[])
Buffers: shared hit=1 read=50
I/O Timings: read=27.492 write=0.000
-> Index Only Scan using index_projects_on_namespace_id_and_id on public.projects (cost=0.57..3.05 rows=25 width=8) (actual time=1.608..2.509 rows=5 loops=821)
Index Cond: (projects.namespace_id = (namespaces.traversal_ids)[array_length(namespaces.traversal_ids, 1)])
Heap Fetches: 366
Buffers: shared hit=5999 read=1240
I/O Timings: read=2019.449 write=0.000
-> Index Only Scan using idx_sbom_occurrences_on_project_id_and_source_id on public.sbom_occurrences (cost=0.56..12.52 rows=683 width=8) (actual time=1.015..1.015 rows=0 loops=3742)
Index Cond: (sbom_occurrences.project_id = projects.id)
Heap Fetches: 37
Buffers: shared hit=12432 read=2872
I/O Timings: read=3719.322 write=0.000
Time: 8.240 s
- planning: 939.773 ms
- execution: 7.300 s
- I/O read: 7.146 s
- I/O write: 0.000 ms
Shared buffers:
- hits: 18432 (~144.00 MiB) from the buffer pool
- reads: 4916 (~38.40 MiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Update sbom_occurrences: https://console.postgres.ai/shared/dfe2aa75-1b63-45f5-b598-398ade8a52c3
UPDATE sbom_occurrences
SET
traversal_ids = '{9970}'
WHERE
sbom_occurrences.project_id = 278964 AND
sbom_occurrences.id >= 3469759181 AND
sbom_occurrences.id < 3469759472;
ModifyTable on public.sbom_occurrences (cost=0.56..3.58 rows=0 width=0) (actual time=496.756..496.758 rows=0 loops=1)
Buffers: shared hit=3395 read=480 dirtied=345
I/O Timings: read=479.713 write=0.000
-> Index Scan using index_sbom_occurrences_on_project_id_and_id on public.sbom_occurrences (cost=0.56..3.58 rows=1 width=38) (actual time=9.889..101.314 rows=99 loops=1)
Index Cond: ((sbom_occurrences.project_id = 278964) AND (sbom_occurrences.id >= '3469759181'::bigint) AND (sbom_occurrences.id < '3469759472'::bigint))
Buffers: shared hit=22 read=110 dirtied=105
I/O Timings: read=98.563 write=0.000
Time: 499.085 ms
- planning: 2.252 ms
- execution: 496.833 ms
- I/O read: 479.713 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 3395 (~26.50 MiB) from the buffer pool
- reads: 480 (~3.80 MiB) from the OS file cache, including disk I/O
- dirtied: 345 (~2.70 MiB)
- writes: 0
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Edited by Brian Williams