Fix Tool Coverage data update when all projects are archived

What does this MR do and why?

This MR fixes a bug where Tool Coverage statistics were not updated when the last project in a namespace/subgroup was archived. The issue occurred specifically once all projects in a namespace became archived: the analyzer namespace statistics kept their stale data instead of being reset to zero.

Problem Description

When projects are archived, the Security::AnalyzerNamespaceStatuses::AdjustmentService is responsible for recalculating namespace-level analyzer statistics by aggregating data from unarchived projects. However, there was a critical gap in the SQL query logic:

Before the fix:

  • The main query used a LEFT JOIN with analyzer_project_statuses filtered by archived = FALSE
  • When all projects were archived, this join returned no rows
  • The HAVING count(analyzer_project_statuses.analyzer_type) > 0 clause filtered out empty results
  • This meant existing analyzer_namespace_statuses records were never updated when all projects became archived
  • The namespace statistics remained at their previous values instead of being reset to zero (see the minimal sketch after this list)
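
A minimal, self-contained sketch of the failure mode (hypothetical rows supplied via a CTE that stands in for analyzer_project_statuses; the traversal_ids scoping from the real query is omitted for brevity): once every project row is archived, the LEFT JOIN contributes only NULLs, the HAVING clause drops the group, and nothing reaches the upsert.

WITH analyzer_project_statuses (analyzer_type, status, archived) AS (
  -- two hypothetical project analyzer rows, both archived
  VALUES (1, 1, TRUE), (1, 2, TRUE)
)
SELECT
  analyzer_project_statuses.analyzer_type,
  COALESCE(SUM((status = 1)::int), 0) AS success,
  COALESCE(SUM((status = 2)::int), 0) AS failure
FROM (VALUES (24)) AS namespace_data(namespace_id)
LEFT JOIN analyzer_project_statuses
  ON analyzer_project_statuses.archived = FALSE
GROUP BY namespace_data.namespace_id, analyzer_project_statuses.analyzer_type
HAVING count(analyzer_project_statuses.analyzer_type) > 0;
-- => zero rows: the upsert never fires, so the existing
--    analyzer_namespace_statuses record keeps its stale success/failure counts.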

Solution

The fix adds a UNION ALL clause to handle the edge case where all projects are archived:

UNION ALL

-- Handle case where all projects are archived: return 0 counts for existing analyzer types
SELECT DISTINCT
  analyzer_namespace_statuses.analyzer_type,
  0 AS success,
  0 AS failure,
  namespace_data.traversal_ids,
  namespace_data.namespace_id,
  now() AS created_at,
  now() AS updated_at
FROM namespace_data
CROSS JOIN analyzer_namespace_statuses
WHERE analyzer_namespace_statuses.namespace_id = namespace_data.namespace_id
AND NOT EXISTS (
  SELECT 1
  FROM analyzer_project_statuses
  WHERE analyzer_project_statuses.archived = FALSE
  AND analyzer_project_statuses.analyzer_type = analyzer_namespace_statuses.analyzer_type
  AND analyzer_project_statuses.traversal_ids >= namespace_data.traversal_ids
  AND analyzer_project_statuses.traversal_ids < namespace_data.next_traversal_id
)
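
Both the existing branch and the new one scope projects to the namespace subtree with the same traversal_ids range check: a project belongs to the subtree when its traversal_ids array sorts between the namespace's own traversal_ids (lower bound) and the next sibling prefix (upper bound). A quick illustration, using the sample values from the query plans below and a hypothetical project array:

-- Hypothetical project under subgroup 24 (group 23): its traversal_ids would
-- look like {23, 24, <project namespace id>} and sorts inside the
-- [{23,24}, {23,25}) range, so it counts toward namespace 24's statistics.
SELECT ARRAY[23, 24, 99]::bigint[] >= ARRAY[23, 24]::bigint[]  -- lower bound: the namespace's own traversal_ids
   AND ARRAY[23, 24, 99]::bigint[] <  ARRAY[23, 25]::bigint[]  -- upper bound: the next sibling prefix
   AS in_subtree;
-- => true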

Changelog: fixed
EE: true

References

#572794

How to set up and validate locally

Before the fix:

  1. Create a subgroup
  2. Import first project
  3. Import second project
  4. Run pipeline for first project → "Tool Coverage" stats update correctly
  5. Run pipeline for second project → "Tool Coverage" stats update correctly
  6. Archive the first project → "Tool Coverage" stats update correctly
  7. Archive the second (last) project → "Tool Coverage" stats remain unchanged

After the fix:

  1. Create a subgroup
  2. Import first project
  3. Import second project
  4. Run pipeline for first project → "Tool Coverage" stats update correctly
  5. Run pipeline for second project → "Tool Coverage" stats update correctly
  6. Archive the first project → "Tool Coverage" stats update correctly
  7. Archive the second (last) project → "Tool Coverage" stats are updated and all counters are reset to 0 (zero); an optional database-level check follows below
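
Optionally, the reset can be confirmed directly in the database. A sketch of the spot check, assuming the column names used by the queries in this MR and a hypothetical namespace ID of 24 (substitute the subgroup's actual ID):

-- After step 7, success and failure should read 0 for every analyzer_type
-- instead of keeping the pre-archive counts.
SELECT analyzer_type, success, failure, updated_at
FROM analyzer_namespace_statuses
WHERE namespace_id = 24;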

Query plans

Before

Raw SQL
WITH new_values AS (
  WITH namespace_data (namespace_id, traversal_ids, next_traversal_id) AS (
    VALUES (23, ARRAY[23]::bigint[], ARRAY[24]::bigint[]), (24, ARRAY[23, 24]::bigint[], ARRAY[23, 25]::bigint[]), (25, ARRAY[23, 24, 25]::bigint[], ARRAY[23, 24, 26]::bigint[])
)
SELECT
  analyzer_project_statuses.analyzer_type,
  COALESCE(SUM((status = 1)::int), 0) AS success,
  COALESCE(SUM((status = 2)::int), 0) AS failure,
  namespace_data.traversal_ids as traversal_ids,
  namespace_data.namespace_id as namespace_id,
  now() AS created_at,
  now() AS updated_at
FROM namespace_data
LEFT JOIN analyzer_project_statuses
  ON analyzer_project_statuses.archived = FALSE
  AND analyzer_project_statuses.traversal_ids >= namespace_data.traversal_ids
  AND analyzer_project_statuses.traversal_ids < namespace_data.next_traversal_id
GROUP BY namespace_data.traversal_ids, namespace_id, analyzer_project_statuses.analyzer_type
HAVING count(analyzer_project_statuses.analyzer_type) > 0
), old_values AS (
  SELECT
  namespace_id,
  traversal_ids,
  analyzer_type,
  success,
  failure
FROM analyzer_namespace_statuses
WHERE namespace_id IN (SELECT namespace_id FROM new_values)

), upserted AS (
  INSERT INTO analyzer_namespace_statuses
  (analyzer_type, success, failure, traversal_ids, namespace_id, created_at, updated_at)
  (SELECT analyzer_type, success, failure, traversal_ids, namespace_id, created_at, updated_at
FROM new_values
)
ON CONFLICT (namespace_id, analyzer_type)
DO UPDATE SET
  success = EXCLUDED.success,
  failure = EXCLUDED.failure,
  updated_at = EXCLUDED.updated_at,
  traversal_ids = EXCLUDED.traversal_ids
RETURNING namespace_id

), diff_values AS (
  SELECT
  new_values.namespace_id AS namespace_id,
  new_values.traversal_ids,
  new_values.analyzer_type,
  new_values.success - COALESCE(old_values.success, 0) AS success,
  new_values.failure - COALESCE(old_values.failure, 0) AS failure
FROM new_values
LEFT JOIN old_values
  ON new_values.namespace_id = old_values.namespace_id AND
     new_values.analyzer_type = old_values.analyzer_type
WHERE EXISTS (
  SELECT 1
  FROM upserted
  WHERE upserted.namespace_id = new_values.namespace_id
)

)
SELECT *
FROM diff_values
WHERE
  success != 0 OR
  failure != 0;
Plan

See full plan here.

Hash Semi Join  (cost=661090.08..661145.14 rows=14 width=54) (actual time=0.053..0.055 rows=0 loops=1)
   Hash Cond: (new_values.namespace_id = upserted.namespace_id)
   Buffers: shared hit=12
   I/O Timings: read=0.000 write=0.000
   CTE new_values
     ->  HashAggregate  (cost=661088.25..661088.84 rows=14 width=70) (actual time=0.039..0.039 rows=0 loops=1)
           Group Key: "*VALUES*".column2, "*VALUES*".column1, analyzer_project_statuses.analyzer_type
           Filter: (count(analyzer_project_statuses.analyzer_type) > 0)
           Rows Removed by Filter: 3
           Buffers: shared hit=12
           I/O Timings: read=0.000 write=0.000
           ->  Nested Loop Left Join  (cost=0.56..538088.77 rows=4919979 width=40) (actual time=0.023..0.032 rows=3 loops=1)
                 Buffers: shared hit=12
                 I/O Timings: read=0.000 write=0.000
                 ->  Values Scan on "*VALUES*"  (cost=0.00..0.04 rows=3 width=68) (actual time=0.003..0.004 rows=3 loops=1)
                       I/O Timings: read=0.000 write=0.000
                 ->  Index Scan using index_analyzer_project_statuses_traversal_ids on public.analyzer_project_statuses  (cost=0.56..162962.98 rows=1639993 width=39) (actual time=0.008..0.008 rows=0 loops=3)
                       Index Cond: ((analyzer_project_statuses.traversal_ids >= "*VALUES*".column2) AND (analyzer_project_statuses.traversal_ids < "*VALUES*".column3))
                       Filter: (NOT analyzer_project_statuses.archived)
                       Rows Removed by Filter: 0
                       Buffers: shared hit=12
                       I/O Timings: read=0.000 write=0.000
   CTE upserted
     ->  ModifyTable on public.analyzer_namespace_statuses analyzer_namespace_statuses_1  (cost=0.00..0.35 rows=14 width=82) (actual time=0.040..0.040 rows=0 loops=1)
           Buffers: shared hit=12
           I/O Timings: read=0.000 write=0.000
           ->  CTE Scan on new_values new_values_2  (cost=0.00..0.35 rows=14 width=82) (actual time=0.039..0.039 rows=0 loops=1)
                 Buffers: shared hit=12
                 I/O Timings: read=0.000 write=0.000
   ->  Nested Loop Left Join  (cost=0.43..55.23 rows=14 width=70) (actual time=0.000..0.000 rows=0 loops=0)
         Filter: (((new_values.success - COALESCE(analyzer_namespace_statuses.success, '0'::bigint)) <> 0) OR ((new_values.failure - COALESCE(analyzer_namespace_statuses.failure, '0'::bigint)) <> 0))
         Rows Removed by Filter: 0
         I/O Timings: read=0.000 write=0.000
         ->  CTE Scan on new_values  (cost=0.00..0.28 rows=14 width=54) (actual time=0.000..0.000 rows=0 loops=0)
               I/O Timings: read=0.000 write=0.000
         ->  Nested Loop Semi Join  (cost=0.43..3.90 rows=1 width=26) (actual time=0.000..0.000 rows=0 loops=0)
               I/O Timings: read=0.000 write=0.000
               ->  Index Scan using index_analyzer_namespace_statuses_status on public.analyzer_namespace_statuses  (cost=0.43..3.45 rows=1 width=26) (actual time=0.000..0.000 rows=0 loops=0)
                     Index Cond: ((analyzer_namespace_statuses.namespace_id = new_values.namespace_id) AND (analyzer_namespace_statuses.analyzer_type = new_values.analyzer_type))
                     I/O Timings: read=0.000 write=0.000
               ->  CTE Scan on new_values new_values_1  (cost=0.00..0.28 rows=14 width=4) (actual time=0.000..0.000 rows=0 loops=0)
                     I/O Timings: read=0.000 write=0.000
   ->  Hash  (cost=0.28..0.28 rows=14 width=8) (actual time=0.040..0.041 rows=0 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 8kB
         Buffers: shared hit=12
         I/O Timings: read=0.000 write=0.000
         ->  CTE Scan on upserted  (cost=0.00..0.28 rows=14 width=8) (actual time=0.040..0.040 rows=0 loops=1)
               Buffers: shared hit=12
               I/O Timings: read=0.000 write=0.000
Settings: seq_page_cost = '4', work_mem = '100MB', random_page_cost = '1.5', effective_cache_size = '338688MB', jit = 'off'

After

Raw SQL
WITH new_values AS (
  WITH namespace_data (namespace_id, traversal_ids, next_traversal_id) AS (
    VALUES (23, ARRAY[23]::bigint[], ARRAY[24]::bigint[]), (24, ARRAY[23, 24]::bigint[], ARRAY[23, 25]::bigint[]), (25, ARRAY[23, 24, 25]::bigint[], ARRAY[23, 24, 26]::bigint[])
)
SELECT
  analyzer_project_statuses.analyzer_type,
  COALESCE(SUM((status = 1)::int), 0) AS success,
  COALESCE(SUM((status = 2)::int), 0) AS failure,
  namespace_data.traversal_ids as traversal_ids,
  namespace_data.namespace_id as namespace_id,
  now() AS created_at,
  now() AS updated_at
FROM namespace_data
LEFT JOIN analyzer_project_statuses
  ON analyzer_project_statuses.archived = FALSE
  AND analyzer_project_statuses.traversal_ids >= namespace_data.traversal_ids
  AND analyzer_project_statuses.traversal_ids < namespace_data.next_traversal_id
GROUP BY namespace_data.traversal_ids, namespace_id, analyzer_project_statuses.analyzer_type
HAVING count(analyzer_project_statuses.analyzer_type) > 0

UNION ALL

-- Handle case where all projects are archived: return 0 counts for existing analyzer types
SELECT DISTINCT
  analyzer_namespace_statuses.analyzer_type,
  0 AS success,
  0 AS failure,
  namespace_data.traversal_ids,
  namespace_data.namespace_id,
  now() AS created_at,
  now() AS updated_at
FROM namespace_data
CROSS JOIN analyzer_namespace_statuses
WHERE analyzer_namespace_statuses.namespace_id = namespace_data.namespace_id
AND NOT EXISTS (
  SELECT 1
  FROM analyzer_project_statuses
  WHERE analyzer_project_statuses.archived = FALSE
  AND analyzer_project_statuses.analyzer_type = analyzer_namespace_statuses.analyzer_type
  AND analyzer_project_statuses.traversal_ids >= namespace_data.traversal_ids
  AND analyzer_project_statuses.traversal_ids < namespace_data.next_traversal_id
)

), old_values AS (
  SELECT
  namespace_id,
  traversal_ids,
  analyzer_type,
  success,
  failure
FROM analyzer_namespace_statuses
WHERE namespace_id IN (SELECT namespace_id FROM new_values)

), upserted AS (
  INSERT INTO analyzer_namespace_statuses
  (analyzer_type, success, failure, traversal_ids, namespace_id, created_at, updated_at)
  (SELECT analyzer_type, success, failure, traversal_ids, namespace_id, created_at, updated_at
FROM new_values
)
ON CONFLICT (namespace_id, analyzer_type)
DO UPDATE SET
  success = EXCLUDED.success,
  failure = EXCLUDED.failure,
  updated_at = EXCLUDED.updated_at,
  traversal_ids = EXCLUDED.traversal_ids
RETURNING namespace_id

), diff_values AS (
  SELECT
  new_values.namespace_id AS namespace_id,
  new_values.traversal_ids,
  new_values.analyzer_type,
  new_values.success - COALESCE(old_values.success, 0) AS success,
  new_values.failure - COALESCE(old_values.failure, 0) AS failure
FROM new_values
LEFT JOIN old_values
  ON new_values.namespace_id = old_values.namespace_id AND
     new_values.analyzer_type = old_values.analyzer_type
WHERE EXISTS (
  SELECT 1
  FROM upserted
  WHERE upserted.namespace_id = new_values.namespace_id
)

)
SELECT *
FROM diff_values
WHERE
  success != 0 OR
  failure != 0;
Plan

See full plan here.

 Hash Semi Join  (cost=2336074.36..2336187.13 rows=26 width=54) (actual time=1.946..1.951 rows=0 loops=1)
   Hash Cond: (new_values.namespace_id = upserted.namespace_id)
   Buffers: shared hit=28 read=7
   I/O Timings: read=1.835 write=0.000
   CTE new_values
     ->  Append  (cost=661088.31..2336072.43 rows=26 width=70) (actual time=1.945..1.947 rows=0 loops=1)
           Buffers: shared hit=28 read=7
           I/O Timings: read=1.835 write=0.000
           CTE namespace_data
             ->  Values Scan on "*VALUES*"  (cost=0.00..0.04 rows=3 width=68) (actual time=0.003..0.005 rows=3 loops=1)
                   I/O Timings: read=0.000 write=0.000
           ->  HashAggregate  (cost=661088.27..661088.87 rows=14 width=70) (actual time=1.785..1.786 rows=0 loops=1)
                 Group Key: namespace_data.traversal_ids, namespace_data.namespace_id, analyzer_project_statuses.analyzer_type
                 Filter: (count(analyzer_project_statuses.analyzer_type) > 0)
                 Rows Removed by Filter: 3
                 Buffers: shared hit=8 read=4
                 I/O Timings: read=1.733 write=0.000
                 ->  Nested Loop Left Join  (cost=0.56..538088.80 rows=4919979 width=40) (actual time=1.763..1.776 rows=3 loops=1)
                       Buffers: shared hit=8 read=4
                       I/O Timings: read=1.733 write=0.000
                       ->  CTE Scan on namespace_data  (cost=0.00..0.06 rows=3 width=68) (actual time=0.004..0.008 rows=3 loops=1)
                             I/O Timings: read=0.000 write=0.000
                       ->  Index Scan using index_analyzer_project_statuses_traversal_ids on public.analyzer_project_statuses  (cost=0.56..162962.98 rows=1639993 width=39) (actual time=0.588..0.588 rows=0 loops=3)
                             Index Cond: ((analyzer_project_statuses.traversal_ids >= namespace_data.traversal_ids) AND (analyzer_project_statuses.traversal_ids < namespace_data.next_traversal_id))
                             Filter: (NOT analyzer_project_statuses.archived)
                             Rows Removed by Filter: 0
                             Buffers: shared hit=8 read=4
                             I/O Timings: read=1.733 write=0.000
           ->  Subquery Scan on *SELECT* 2  (cost=1674983.10..1674983.40 rows=12 width=70) (actual time=0.158..0.160 rows=0 loops=1)
                 Buffers: shared hit=20 read=3
                 I/O Timings: read=0.102 write=0.000
                 ->  Unique  (cost=1674983.10..1674983.22 rows=12 width=62) (actual time=0.158..0.159 rows=0 loops=1)
                       Buffers: shared hit=20 read=3
                       I/O Timings: read=0.102 write=0.000
                       ->  Sort  (cost=1674983.10..1674983.13 rows=12 width=62) (actual time=0.157..0.158 rows=0 loops=1)
                             Sort Key: analyzer_namespace_statuses_1.analyzer_type, namespace_data_1.traversal_ids, namespace_data_1.namespace_id
                             Sort Method: quicksort  Memory: 25kB
                             Buffers: shared hit=20 read=3
                             I/O Timings: read=0.102 write=0.000
                             ->  Nested Loop Anti Join  (cost=0.86..1674982.89 rows=12 width=62) (actual time=0.124..0.125 rows=0 loops=1)
                                   Buffers: shared hit=9 read=3
                                   I/O Timings: read=0.102 write=0.000
                                   ->  Nested Loop  (cost=0.43..8.90 rows=13 width=70) (actual time=0.124..0.124 rows=0 loops=1)
                                         Buffers: shared hit=9 read=3
                                         I/O Timings: read=0.102 write=0.000
                                         ->  CTE Scan on namespace_data namespace_data_1  (cost=0.00..0.06 rows=3 width=68) (actual time=0.000..0.001 rows=3 loops=1)
                                               I/O Timings: read=0.000 write=0.000
                                         ->  Index Only Scan using index_analyzer_namespace_statuses_status on public.analyzer_namespace_statuses analyzer_namespace_statuses_1  (cost=0.43..2.91 rows=4 width=10) (actual time=0.040..0.040 rows=0 loops=3)
                                               Index Cond: (analyzer_namespace_statuses_1.namespace_id = namespace_data_1.namespace_id)
                                               Heap Fetches: 0
                                               Buffers: shared hit=9 read=3
                                               I/O Timings: read=0.102 write=0.000
                                   ->  Index Scan using index_analyzer_project_statuses_status on public.analyzer_project_statuses analyzer_project_statuses_1  (cost=0.43..123766.90 rows=1054281 width=37) (actual time=0.000..0.000 rows=0 loops=0)
                                         Index Cond: (analyzer_project_statuses_1.analyzer_type = analyzer_namespace_statuses_1.analyzer_type)
                                         Filter: (NOT analyzer_project_statuses_1.archived)
                                         Rows Removed by Filter: 0
                                         I/O Timings: read=0.000 write=0.000
   CTE upserted
     ->  ModifyTable on public.analyzer_namespace_statuses analyzer_namespace_statuses_2  (cost=0.00..0.65 rows=26 width=82) (actual time=0.001..0.001 rows=0 loops=1)
           I/O Timings: read=0.000 write=0.000
           ->  CTE Scan on new_values new_values_2  (cost=0.00..0.65 rows=26 width=82) (actual time=0.000..0.000 rows=0 loops=1)
                 I/O Timings: read=0.000 write=0.000
   ->  Nested Loop Left Join  (cost=0.43..112.71 rows=26 width=70) (actual time=1.946..1.946 rows=0 loops=1)
         Filter: (((new_values.success - COALESCE(analyzer_namespace_statuses.success, '0'::bigint)) <> 0) OR ((new_values.failure - COALESCE(analyzer_namespace_statuses.failure, '0'::bigint)) <> 0))
         Rows Removed by Filter: 0
         Buffers: shared hit=28 read=7
         I/O Timings: read=1.835 write=0.000
         ->  CTE Scan on new_values  (cost=0.00..0.52 rows=26 width=54) (actual time=1.945..1.946 rows=0 loops=1)
               Buffers: shared hit=28 read=7
               I/O Timings: read=1.835 write=0.000
         ->  Nested Loop Semi Join  (cost=0.43..4.29 rows=1 width=26) (actual time=0.000..0.000 rows=0 loops=0)
               I/O Timings: read=0.000 write=0.000
               ->  Index Scan using index_analyzer_namespace_statuses_status on public.analyzer_namespace_statuses  (cost=0.43..3.45 rows=1 width=26) (actual time=0.000..0.000 rows=0 loops=0)
                     Index Cond: ((analyzer_namespace_statuses.namespace_id = new_values.namespace_id) AND (analyzer_namespace_statuses.analyzer_type = new_values.analyzer_type))
                     I/O Timings: read=0.000 write=0.000
               ->  CTE Scan on new_values new_values_1  (cost=0.00..0.52 rows=26 width=4) (actual time=0.000..0.000 rows=0 loops=0)
                     I/O Timings: read=0.000 write=0.000
   ->  Hash  (cost=0.52..0.52 rows=26 width=8) (actual time=0.000..0.000 rows=0 loops=0)
         I/O Timings: read=0.000 write=0.000
         ->  CTE Scan on upserted  (cost=0.00..0.52 rows=26 width=8) (actual time=0.000..0.000 rows=0 loops=0)
               I/O Timings: read=0.000 write=0.000
Settings: random_page_cost = '1.5', effective_cache_size = '338688MB', jit = 'off', seq_page_cost = '4', work_mem = '100MB'

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
