Fix Tool Coverage data update when all projects are archived
What does this MR do and why?
This MR fixes a bug where Tool Coverage statistics were not properly updated when archiving the last project in a namespace/subgroup. The issue occurred specifically when all projects in a namespace became archived, leaving the analyzer namespace statistics with stale data instead of being reset to zero.
Problem Description
When projects are archived, the `Security::AnalyzerNamespaceStatuses::AdjustmentService` is responsible for recalculating namespace-level analyzer statistics by aggregating data from unarchived projects. However, there was a critical gap in the SQL query logic:
Before the fix:
- The main query used a `LEFT JOIN` with `analyzer_project_statuses`, filtered by `archived = FALSE`.
- When all projects were archived, this join returned no rows.
- The `HAVING count(analyzer_project_statuses.analyzer_type) > 0` clause filtered out the resulting empty groups.
- This meant existing `analyzer_namespace_statuses` records were never updated when all projects became archived.
- The namespace statistics remained at their previous values instead of being reset to zero.
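The failure mode can be illustrated with a small, self-contained Python sketch (this is not the GitLab service code; the row dictionaries and function name are hypothetical, and the real query scopes projects by `traversal_ids` ranges rather than a flat `namespace_id`). It mimics the `LEFT JOIN` on `archived = FALSE` plus the `HAVING count(...) > 0` filter: when every project row is archived, no group survives, so no update row is ever produced for the namespace.

```python
from collections import defaultdict

def aggregate_namespace_stats(project_statuses, namespace_ids):
    """Return {(namespace_id, analyzer_type): (success, failure)} counts,
    considering unarchived projects only. Namespaces whose projects are
    all archived produce no rows at all, so their stale records are
    never overwritten -- the bug described above."""
    counts = defaultdict(lambda: [0, 0])
    for row in project_statuses:
        if row["archived"]:  # mirrors the join condition archived = FALSE
            continue
        key = (row["namespace_id"], row["analyzer_type"])
        if row["status"] == 1:
            counts[key][0] += 1  # success
        elif row["status"] == 2:
            counts[key][1] += 1  # failure
    # mirrors HAVING count(...) > 0: only groups with matches survive
    return {k: tuple(v) for k, v in counts.items() if k[0] in namespace_ids}

rows = [
    {"namespace_id": 23, "analyzer_type": "sast", "status": 1, "archived": True},
    {"namespace_id": 23, "analyzer_type": "sast", "status": 2, "archived": True},
]
print(aggregate_namespace_stats(rows, {23}))  # {} -- no row, stats stay stale
```

With every project archived the function returns an empty mapping, so the subsequent upsert has nothing to write and the old counters persist.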
Solution
The fix adds a `UNION ALL` clause to handle the edge case where all projects are archived:
UNION ALL
-- Handle case where all projects are archived: return 0 counts for existing analyzer types
SELECT DISTINCT
analyzer_namespace_statuses.analyzer_type,
0 AS success,
0 AS failure,
namespace_data.traversal_ids,
namespace_data.namespace_id,
now() AS created_at,
now() AS updated_at
FROM namespace_data
CROSS JOIN analyzer_namespace_statuses
WHERE analyzer_namespace_statuses.namespace_id = namespace_data.namespace_id
AND NOT EXISTS (
SELECT 1
FROM analyzer_project_statuses
WHERE analyzer_project_statuses.archived = FALSE
AND analyzer_project_statuses.analyzer_type = analyzer_namespace_statuses.analyzer_type
AND analyzer_project_statuses.traversal_ids >= namespace_data.traversal_ids
AND analyzer_project_statuses.traversal_ids < namespace_data.next_traversal_id
)
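The effect of this `UNION ALL` branch can be sketched in Python (again a hypothetical simulation, not the service code; it flattens the `traversal_ids` range check to a plain `namespace_id` match for brevity). It mirrors the `NOT EXISTS` subquery: any `(namespace_id, analyzer_type)` pair that already has a namespace-level record but no remaining unarchived project rows gets an explicit zero row, which the upsert then uses to reset the stale counters.

```python
def zero_rows_for_archived(namespace_statuses, project_statuses):
    """Return {(namespace_id, analyzer_type): (0, 0)} for every existing
    namespace-level record with no unarchived project status left --
    the rows the UNION ALL branch contributes to new_values."""
    # pairs that still have at least one unarchived project row
    live = {
        (r["namespace_id"], r["analyzer_type"])
        for r in project_statuses
        if not r["archived"]
    }
    # mirrors NOT EXISTS: existing records with no live counterpart get zeros
    return {
        (s["namespace_id"], s["analyzer_type"]): (0, 0)
        for s in namespace_statuses
        if (s["namespace_id"], s["analyzer_type"]) not in live
    }
```

Unioning these zero rows with the regular aggregation means the upsert always receives a row for every affected analyzer type, so archiving the last project now resets the counters instead of leaving them untouched.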
Changelog: fixed
EE: true
References
#572794
How to set up and validate locally
Before the fix:
- Create a subgroup
- Import first project
- Import second project
- Run pipeline for first project → "Tool Coverage" stats update correctly ✅
- Run pipeline for second project → "Tool Coverage" stats update correctly ✅
- Archive the first project → "Tool Coverage" stats update correctly ✅
- Archive the second (last) project → "Tool Coverage" stats remain unchanged ❌
After the fix:
- Create a subgroup
- Import first project
- Import second project
- Run pipeline for first project → "Tool Coverage" stats update correctly ✅
- Run pipeline for second project → "Tool Coverage" stats update correctly ✅
- Archive the first project → "Tool Coverage" stats update correctly ✅
- Archive the second (last) project → "Tool Coverage" stats are updated and the counters are reset to 0 (zero) ✅
Query plans
Before
Raw SQL
WITH new_values AS (
WITH namespace_data (namespace_id, traversal_ids, next_traversal_id) AS (
VALUES (23, ARRAY[23]::bigint[], ARRAY[24]::bigint[]), (24, ARRAY[23, 24]::bigint[], ARRAY[23, 25]::bigint[]), (25, ARRAY[23, 24, 25]::bigint[], ARRAY[23, 24, 26]::bigint[])
)
SELECT
analyzer_project_statuses.analyzer_type,
COALESCE(SUM((status = 1)::int), 0) AS success,
COALESCE(SUM((status = 2)::int), 0) AS failure,
namespace_data.traversal_ids as traversal_ids,
namespace_data.namespace_id as namespace_id,
now() AS created_at,
now() AS updated_at
FROM namespace_data
LEFT JOIN analyzer_project_statuses
ON analyzer_project_statuses.archived = FALSE
AND analyzer_project_statuses.traversal_ids >= namespace_data.traversal_ids
AND analyzer_project_statuses.traversal_ids < namespace_data.next_traversal_id
GROUP BY namespace_data.traversal_ids, namespace_id, analyzer_project_statuses.analyzer_type
HAVING count(analyzer_project_statuses.analyzer_type) > 0
), old_values AS (
SELECT
namespace_id,
traversal_ids,
analyzer_type,
success,
failure
FROM analyzer_namespace_statuses
WHERE namespace_id IN (SELECT namespace_id FROM new_values)
), upserted AS (
INSERT INTO analyzer_namespace_statuses
(analyzer_type, success, failure, traversal_ids, namespace_id, created_at, updated_at)
(SELECT analyzer_type, success, failure, traversal_ids, namespace_id, created_at, updated_at
FROM new_values
)
ON CONFLICT (namespace_id, analyzer_type)
DO UPDATE SET
success = EXCLUDED.success,
failure = EXCLUDED.failure,
updated_at = EXCLUDED.updated_at,
traversal_ids = EXCLUDED.traversal_ids
RETURNING namespace_id
), diff_values AS (
SELECT
new_values.namespace_id AS namespace_id,
new_values.traversal_ids,
new_values.analyzer_type,
new_values.success - COALESCE(old_values.success, 0) AS success,
new_values.failure - COALESCE(old_values.failure, 0) AS failure
FROM new_values
LEFT JOIN old_values
ON new_values.namespace_id = old_values.namespace_id AND
new_values.analyzer_type = old_values.analyzer_type
WHERE EXISTS (
SELECT 1
FROM upserted
WHERE upserted.namespace_id = new_values.namespace_id
)
)
SELECT *
FROM diff_values
WHERE
success != 0 OR
failure != 0;
Plan
See full plan here.
Hash Semi Join (cost=661090.08..661145.14 rows=14 width=54) (actual time=0.053..0.055 rows=0 loops=1)
Hash Cond: (new_values.namespace_id = upserted.namespace_id)
Buffers: shared hit=12
I/O Timings: read=0.000 write=0.000
CTE new_values
-> HashAggregate (cost=661088.25..661088.84 rows=14 width=70) (actual time=0.039..0.039 rows=0 loops=1)
Group Key: "*VALUES*".column2, "*VALUES*".column1, analyzer_project_statuses.analyzer_type
Filter: (count(analyzer_project_statuses.analyzer_type) > 0)
Rows Removed by Filter: 3
Buffers: shared hit=12
I/O Timings: read=0.000 write=0.000
-> Nested Loop Left Join (cost=0.56..538088.77 rows=4919979 width=40) (actual time=0.023..0.032 rows=3 loops=1)
Buffers: shared hit=12
I/O Timings: read=0.000 write=0.000
-> Values Scan on "*VALUES*" (cost=0.00..0.04 rows=3 width=68) (actual time=0.003..0.004 rows=3 loops=1)
I/O Timings: read=0.000 write=0.000
-> Index Scan using index_analyzer_project_statuses_traversal_ids on public.analyzer_project_statuses (cost=0.56..162962.98 rows=1639993 width=39) (actual time=0.008..0.008 rows=0 loops=3)
Index Cond: ((analyzer_project_statuses.traversal_ids >= "*VALUES*".column2) AND (analyzer_project_statuses.traversal_ids < "*VALUES*".column3))
Filter: (NOT analyzer_project_statuses.archived)
Rows Removed by Filter: 0
Buffers: shared hit=12
I/O Timings: read=0.000 write=0.000
CTE upserted
-> ModifyTable on public.analyzer_namespace_statuses analyzer_namespace_statuses_1 (cost=0.00..0.35 rows=14 width=82) (actual time=0.040..0.040 rows=0 loops=1)
Buffers: shared hit=12
I/O Timings: read=0.000 write=0.000
-> CTE Scan on new_values new_values_2 (cost=0.00..0.35 rows=14 width=82) (actual time=0.039..0.039 rows=0 loops=1)
Buffers: shared hit=12
I/O Timings: read=0.000 write=0.000
-> Nested Loop Left Join (cost=0.43..55.23 rows=14 width=70) (actual time=0.000..0.000 rows=0 loops=0)
Filter: (((new_values.success - COALESCE(analyzer_namespace_statuses.success, '0'::bigint)) <> 0) OR ((new_values.failure - COALESCE(analyzer_namespace_statuses.failure, '0'::bigint)) <> 0))
Rows Removed by Filter: 0
I/O Timings: read=0.000 write=0.000
-> CTE Scan on new_values (cost=0.00..0.28 rows=14 width=54) (actual time=0.000..0.000 rows=0 loops=0)
I/O Timings: read=0.000 write=0.000
-> Nested Loop Semi Join (cost=0.43..3.90 rows=1 width=26) (actual time=0.000..0.000 rows=0 loops=0)
I/O Timings: read=0.000 write=0.000
-> Index Scan using index_analyzer_namespace_statuses_status on public.analyzer_namespace_statuses (cost=0.43..3.45 rows=1 width=26) (actual time=0.000..0.000 rows=0 loops=0)
Index Cond: ((analyzer_namespace_statuses.namespace_id = new_values.namespace_id) AND (analyzer_namespace_statuses.analyzer_type = new_values.analyzer_type))
I/O Timings: read=0.000 write=0.000
-> CTE Scan on new_values new_values_1 (cost=0.00..0.28 rows=14 width=4) (actual time=0.000..0.000 rows=0 loops=0)
I/O Timings: read=0.000 write=0.000
-> Hash (cost=0.28..0.28 rows=14 width=8) (actual time=0.040..0.041 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
Buffers: shared hit=12
I/O Timings: read=0.000 write=0.000
-> CTE Scan on upserted (cost=0.00..0.28 rows=14 width=8) (actual time=0.040..0.040 rows=0 loops=1)
Buffers: shared hit=12
I/O Timings: read=0.000 write=0.000
Settings: seq_page_cost = '4', work_mem = '100MB', random_page_cost = '1.5', effective_cache_size = '338688MB', jit = 'off'
After
Raw SQL
WITH new_values AS (
WITH namespace_data (namespace_id, traversal_ids, next_traversal_id) AS (
VALUES (23, ARRAY[23]::bigint[], ARRAY[24]::bigint[]), (24, ARRAY[23, 24]::bigint[], ARRAY[23, 25]::bigint[]), (25, ARRAY[23, 24, 25]::bigint[], ARRAY[23, 24, 26]::bigint[])
)
SELECT
analyzer_project_statuses.analyzer_type,
COALESCE(SUM((status = 1)::int), 0) AS success,
COALESCE(SUM((status = 2)::int), 0) AS failure,
namespace_data.traversal_ids as traversal_ids,
namespace_data.namespace_id as namespace_id,
now() AS created_at,
now() AS updated_at
FROM namespace_data
LEFT JOIN analyzer_project_statuses
ON analyzer_project_statuses.archived = FALSE
AND analyzer_project_statuses.traversal_ids >= namespace_data.traversal_ids
AND analyzer_project_statuses.traversal_ids < namespace_data.next_traversal_id
GROUP BY namespace_data.traversal_ids, namespace_id, analyzer_project_statuses.analyzer_type
HAVING count(analyzer_project_statuses.analyzer_type) > 0
UNION ALL
-- Handle case where all projects are archived: return 0 counts for existing analyzer types
SELECT DISTINCT
analyzer_namespace_statuses.analyzer_type,
0 AS success,
0 AS failure,
namespace_data.traversal_ids,
namespace_data.namespace_id,
now() AS created_at,
now() AS updated_at
FROM namespace_data
CROSS JOIN analyzer_namespace_statuses
WHERE analyzer_namespace_statuses.namespace_id = namespace_data.namespace_id
AND NOT EXISTS (
SELECT 1
FROM analyzer_project_statuses
WHERE analyzer_project_statuses.archived = FALSE
AND analyzer_project_statuses.analyzer_type = analyzer_namespace_statuses.analyzer_type
AND analyzer_project_statuses.traversal_ids >= namespace_data.traversal_ids
AND analyzer_project_statuses.traversal_ids < namespace_data.next_traversal_id
)
), old_values AS (
SELECT
namespace_id,
traversal_ids,
analyzer_type,
success,
failure
FROM analyzer_namespace_statuses
WHERE namespace_id IN (SELECT namespace_id FROM new_values)
), upserted AS (
INSERT INTO analyzer_namespace_statuses
(analyzer_type, success, failure, traversal_ids, namespace_id, created_at, updated_at)
(SELECT analyzer_type, success, failure, traversal_ids, namespace_id, created_at, updated_at
FROM new_values
)
ON CONFLICT (namespace_id, analyzer_type)
DO UPDATE SET
success = EXCLUDED.success,
failure = EXCLUDED.failure,
updated_at = EXCLUDED.updated_at,
traversal_ids = EXCLUDED.traversal_ids
RETURNING namespace_id
), diff_values AS (
SELECT
new_values.namespace_id AS namespace_id,
new_values.traversal_ids,
new_values.analyzer_type,
new_values.success - COALESCE(old_values.success, 0) AS success,
new_values.failure - COALESCE(old_values.failure, 0) AS failure
FROM new_values
LEFT JOIN old_values
ON new_values.namespace_id = old_values.namespace_id AND
new_values.analyzer_type = old_values.analyzer_type
WHERE EXISTS (
SELECT 1
FROM upserted
WHERE upserted.namespace_id = new_values.namespace_id
)
)
SELECT *
FROM diff_values
WHERE
success != 0 OR
failure != 0;
Plan
See full plan here.
Hash Semi Join (cost=2336074.36..2336187.13 rows=26 width=54) (actual time=1.946..1.951 rows=0 loops=1)
Hash Cond: (new_values.namespace_id = upserted.namespace_id)
Buffers: shared hit=28 read=7
I/O Timings: read=1.835 write=0.000
CTE new_values
-> Append (cost=661088.31..2336072.43 rows=26 width=70) (actual time=1.945..1.947 rows=0 loops=1)
Buffers: shared hit=28 read=7
I/O Timings: read=1.835 write=0.000
CTE namespace_data
-> Values Scan on "*VALUES*" (cost=0.00..0.04 rows=3 width=68) (actual time=0.003..0.005 rows=3 loops=1)
I/O Timings: read=0.000 write=0.000
-> HashAggregate (cost=661088.27..661088.87 rows=14 width=70) (actual time=1.785..1.786 rows=0 loops=1)
Group Key: namespace_data.traversal_ids, namespace_data.namespace_id, analyzer_project_statuses.analyzer_type
Filter: (count(analyzer_project_statuses.analyzer_type) > 0)
Rows Removed by Filter: 3
Buffers: shared hit=8 read=4
I/O Timings: read=1.733 write=0.000
-> Nested Loop Left Join (cost=0.56..538088.80 rows=4919979 width=40) (actual time=1.763..1.776 rows=3 loops=1)
Buffers: shared hit=8 read=4
I/O Timings: read=1.733 write=0.000
-> CTE Scan on namespace_data (cost=0.00..0.06 rows=3 width=68) (actual time=0.004..0.008 rows=3 loops=1)
I/O Timings: read=0.000 write=0.000
-> Index Scan using index_analyzer_project_statuses_traversal_ids on public.analyzer_project_statuses (cost=0.56..162962.98 rows=1639993 width=39) (actual time=0.588..0.588 rows=0 loops=3)
Index Cond: ((analyzer_project_statuses.traversal_ids >= namespace_data.traversal_ids) AND (analyzer_project_statuses.traversal_ids < namespace_data.next_traversal_id))
Filter: (NOT analyzer_project_statuses.archived)
Rows Removed by Filter: 0
Buffers: shared hit=8 read=4
I/O Timings: read=1.733 write=0.000
-> Subquery Scan on *SELECT* 2 (cost=1674983.10..1674983.40 rows=12 width=70) (actual time=0.158..0.160 rows=0 loops=1)
Buffers: shared hit=20 read=3
I/O Timings: read=0.102 write=0.000
-> Unique (cost=1674983.10..1674983.22 rows=12 width=62) (actual time=0.158..0.159 rows=0 loops=1)
Buffers: shared hit=20 read=3
I/O Timings: read=0.102 write=0.000
-> Sort (cost=1674983.10..1674983.13 rows=12 width=62) (actual time=0.157..0.158 rows=0 loops=1)
Sort Key: analyzer_namespace_statuses_1.analyzer_type, namespace_data_1.traversal_ids, namespace_data_1.namespace_id
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=20 read=3
I/O Timings: read=0.102 write=0.000
-> Nested Loop Anti Join (cost=0.86..1674982.89 rows=12 width=62) (actual time=0.124..0.125 rows=0 loops=1)
Buffers: shared hit=9 read=3
I/O Timings: read=0.102 write=0.000
-> Nested Loop (cost=0.43..8.90 rows=13 width=70) (actual time=0.124..0.124 rows=0 loops=1)
Buffers: shared hit=9 read=3
I/O Timings: read=0.102 write=0.000
-> CTE Scan on namespace_data namespace_data_1 (cost=0.00..0.06 rows=3 width=68) (actual time=0.000..0.001 rows=3 loops=1)
I/O Timings: read=0.000 write=0.000
-> Index Only Scan using index_analyzer_namespace_statuses_status on public.analyzer_namespace_statuses analyzer_namespace_statuses_1 (cost=0.43..2.91 rows=4 width=10) (actual time=0.040..0.040 rows=0 loops=3)
Index Cond: (analyzer_namespace_statuses_1.namespace_id = namespace_data_1.namespace_id)
Heap Fetches: 0
Buffers: shared hit=9 read=3
I/O Timings: read=0.102 write=0.000
-> Index Scan using index_analyzer_project_statuses_status on public.analyzer_project_statuses analyzer_project_statuses_1 (cost=0.43..123766.90 rows=1054281 width=37) (actual time=0.000..0.000 rows=0 loops=0)
Index Cond: (analyzer_project_statuses_1.analyzer_type = analyzer_namespace_statuses_1.analyzer_type)
Filter: (NOT analyzer_project_statuses_1.archived)
Rows Removed by Filter: 0
I/O Timings: read=0.000 write=0.000
CTE upserted
-> ModifyTable on public.analyzer_namespace_statuses analyzer_namespace_statuses_2 (cost=0.00..0.65 rows=26 width=82) (actual time=0.001..0.001 rows=0 loops=1)
I/O Timings: read=0.000 write=0.000
-> CTE Scan on new_values new_values_2 (cost=0.00..0.65 rows=26 width=82) (actual time=0.000..0.000 rows=0 loops=1)
I/O Timings: read=0.000 write=0.000
-> Nested Loop Left Join (cost=0.43..112.71 rows=26 width=70) (actual time=1.946..1.946 rows=0 loops=1)
Filter: (((new_values.success - COALESCE(analyzer_namespace_statuses.success, '0'::bigint)) <> 0) OR ((new_values.failure - COALESCE(analyzer_namespace_statuses.failure, '0'::bigint)) <> 0))
Rows Removed by Filter: 0
Buffers: shared hit=28 read=7
I/O Timings: read=1.835 write=0.000
-> CTE Scan on new_values (cost=0.00..0.52 rows=26 width=54) (actual time=1.945..1.946 rows=0 loops=1)
Buffers: shared hit=28 read=7
I/O Timings: read=1.835 write=0.000
-> Nested Loop Semi Join (cost=0.43..4.29 rows=1 width=26) (actual time=0.000..0.000 rows=0 loops=0)
I/O Timings: read=0.000 write=0.000
-> Index Scan using index_analyzer_namespace_statuses_status on public.analyzer_namespace_statuses (cost=0.43..3.45 rows=1 width=26) (actual time=0.000..0.000 rows=0 loops=0)
Index Cond: ((analyzer_namespace_statuses.namespace_id = new_values.namespace_id) AND (analyzer_namespace_statuses.analyzer_type = new_values.analyzer_type))
I/O Timings: read=0.000 write=0.000
-> CTE Scan on new_values new_values_1 (cost=0.00..0.52 rows=26 width=4) (actual time=0.000..0.000 rows=0 loops=0)
I/O Timings: read=0.000 write=0.000
-> Hash (cost=0.52..0.52 rows=26 width=8) (actual time=0.000..0.000 rows=0 loops=0)
I/O Timings: read=0.000 write=0.000
-> CTE Scan on upserted (cost=0.00..0.52 rows=26 width=8) (actual time=0.000..0.000 rows=0 loops=0)
I/O Timings: read=0.000 write=0.000
Settings: random_page_cost = '1.5', effective_cache_size = '338688MB', jit = 'off', seq_page_cost = '4', work_mem = '100MB'
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.