Skip to content

Add index to vulnerability_reads on severity

What does this MR do and why?

Describe in detail what your merge request does and why.

Fixes: #377484 (closed)

When querying for vulnerability_reads, we perform sorting on vulnerability_id and severity, using the following query:

SELECT 
  "vulnerability_reads".* 
FROM 
  unnest(
    ARRAY(
      SELECT 
        "namespaces"."id" 
      FROM 
        "namespaces" 
      WHERE 
        "namespaces"."type" = 'Group' 
        AND (
          traversal_ids @> ('{9970}' :: integer[])
        )
    ) :: bigint[]
  ) AS "namespace_ids"("namespace_id"), 
  unnest(
    '{6,2,5,3,1,0,4,99}' :: smallint[]
  ) AS "report_types"("report_type"), 
  unnest('{1,4}' :: smallint[]) AS "states"("state"), 
  LATERAL (
    SELECT 
      "vulnerability_reads".* 
    FROM 
      "vulnerability_reads" 
    WHERE 
      (
        vulnerability_reads."namespace_id" = "namespace_ids"."namespace_id"
      ) 
      AND (
        vulnerability_reads."report_type" = "report_types"."report_type"
      ) 
      AND (
        vulnerability_reads."state" = "states"."state"
      ) 
    ORDER BY 
      "vulnerability_reads"."severity" DESC, 
      "vulnerability_reads"."vulnerability_id" DESC 
    LIMIT 
      101
  ) AS vulnerability_reads 
ORDER BY 
  "vulnerability_reads"."severity" DESC, 
  "vulnerability_reads"."vulnerability_id" DESC 
LIMIT 
  101;

vulnerability_id is indexed, but severity is not. When performing sorting on a group with 250k+ vulnerabilities, it takes 1.12 minutes without the severity index, and less than 500 ms with the index. This is because Postgres must perform sorting first, and then apply the limit. If the limit were to be applied first, then there may be records further down in the table which should be ordered before the ones that were in the first page.

This MR adds and index on severity so that the query is faster and will not timeout. Since we already have an index on vulnerability_id, having a single-column index uses less storage space than a composite index on vulnerability_id and severity. Additionally, testing in Postgres.ai shows that the two single-column indices perform better than a composite index.

Explain Plans

Before (without severity index): https://console.postgres.ai/shared/9ca82f55-6a09-4b6a-9215-16cf7e8019ce

 Limit  (cost=35120.20..35120.46 rows=101 width=160) (actual time=67610.754..67610.780 rows=101 loops=1)
   Buffers: shared hit=60744 read=57250 dirtied=5794
   I/O Timings: read=66165.579 write=0.000
   InitPlan 1 (returns $0)
     ->  Bitmap Heap Scan on public.namespaces  (cost=343.68..654.09 rows=200 width=4) (actual time=96.162..480.446 rows=410 loops=1)
           Buffers: shared hit=1 read=464 dirtied=6
           I/O Timings: read=469.452 write=0.000
           ->  Bitmap Index Scan using index_namespaces_on_traversal_ids_for_groups  (cost=0.00..343.63 rows=200 width=0) (actual time=93.303..93.304 rows=414 loops=1)
                 Index Cond: (namespaces.traversal_ids @> '{9970}'::integer[])
                 Buffers: shared hit=1 read=226
                 I/O Timings: read=85.479 write=0.000
   ->  Sort  (cost=34466.11..34506.51 rows=16160 width=160) (actual time=67610.752..67610.763 rows=101 loops=1)
         Sort Key: vulnerability_reads.severity DESC, vulnerability_reads.vulnerability_id DESC
         Sort Method: top-N heapsort  Memory: 49kB
         Buffers: shared hit=60744 read=57250 dirtied=5794
         I/O Timings: read=66165.579 write=0.000
         ->  Nested Loop  (cost=207.28..33847.33 rows=16160 width=160) (actual time=590.861..67605.587 rows=10562 loops=1)
               Buffers: shared hit=60738 read=57250 dirtied=5794
               I/O Timings: read=66165.579 write=0.000
               ->  Function Scan on unnest states  (cost=0.00..0.02 rows=2 width=2) (actual time=0.010..0.011 rows=2 loops=1)
                     I/O Timings: read=0.000 write=0.000
               ->  Nested Loop  (cost=207.27..16842.85 rows=8080 width=160) (actual time=296.966..33801.159 rows=5281 loops=2)
                     Buffers: shared hit=60738 read=57250 dirtied=5794
                     I/O Timings: read=66165.579 write=0.000
                     ->  Function Scan on unnest namespace_ids  (cost=0.03..0.13 rows=10 width=8) (actual time=240.409..240.540 rows=410 loops=2)
                           Buffers: shared hit=1 read=464 dirtied=6
                           I/O Timings: read=469.452 write=0.000
                     ->  Nested Loop  (cost=207.24..1676.19 rows=808 width=160) (actual time=33.461..81.850 rows=13 loops=820)
                           Buffers: shared hit=60737 read=56786 dirtied=5788
                           I/O Timings: read=65696.127 write=0.000
                           ->  Function Scan on unnest report_types  (cost=0.00..0.08 rows=8 width=2) (actual time=0.000..0.002 rows=8 loops=820)
                                 I/O Timings: read=0.000 write=0.000
                           ->  Limit  (cost=207.24..207.49 rows=101 width=160) (actual time=10.229..10.230 rows=2 loops=6560)
                                 Buffers: shared hit=60737 read=56786 dirtied=5788
                                 I/O Timings: read=65696.127 write=0.000
                                 ->  Sort  (cost=207.24..207.59 rows=139 width=160) (actual time=10.229..10.229 rows=2 loops=6560)
                                       Sort Key: vulnerability_reads.severity DESC, vulnerability_reads.vulnerability_id DESC
                                       Sort Method: quicksort  Memory: 25kB
                                       Buffers: shared hit=60737 read=56786 dirtied=5788
                                       I/O Timings: read=65696.127 write=0.000
                                       ->  Index Scan using index_vulnerability_reads_common_finder_query_with_namespace_id on public.vulnerability_reads  (cost=0.56..202.29 rows=139 width=160) (actual time=0.219..10.178 rows=56 loops=6560)
                                             Index Cond: ((vulnerability_reads.namespace_id = namespace_ids.namespace_id) AND (vulnerability_reads.state = states.state) AND (vulnerability_reads.report_type = report_types.report_type))
                                             Buffers: shared hit=60737 read=56786 dirtied=5788
                                             I/O Timings: read=65696.127 write=0.000

Time: 1.127 min
  - planning: 5.139 ms
  - execution: 1.127 min
    - I/O read: 1.103 min
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 60744 (~474.60 MiB) from the buffer pool
  - reads: 57250 (~447.30 MiB) from the OS file cache, including disk I/O
  - dirtied: 5794 (~45.30 MiB)
  - writes: 0

After (with severity index): https://console.postgres.ai/shared/b20c1064-f8f9-4be4-8c6d-750063241182

 Limit  (cost=35120.20..35120.46 rows=101 width=160) (actual time=429.307..429.331 rows=101 loops=1)
   Buffers: shared hit=117688
   I/O Timings: read=0.000 write=0.000
   InitPlan 1 (returns $0)
     ->  Bitmap Heap Scan on public.namespaces  (cost=343.68..654.09 rows=200 width=4) (actual time=9.728..10.612 rows=410 loops=1)
           Buffers: shared hit=465
           I/O Timings: read=0.000 write=0.000
           ->  Bitmap Index Scan using index_namespaces_on_traversal_ids_for_groups  (cost=0.00..343.63 rows=200 width=0) (actual time=9.647..9.648 rows=414 loops=1)
                 Index Cond: (namespaces.traversal_ids @> '{9970}'::integer[])
                 Buffers: shared hit=227
                 I/O Timings: read=0.000 write=0.000
   ->  Sort  (cost=34466.11..34506.51 rows=16160 width=160) (actual time=429.305..429.315 rows=101 loops=1)
         Sort Key: vulnerability_reads.severity DESC, vulnerability_reads.vulnerability_id DESC
         Sort Method: top-N heapsort  Memory: 49kB
         Buffers: shared hit=117688
         I/O Timings: read=0.000 write=0.000
         ->  Nested Loop  (cost=207.28..33847.33 rows=16160 width=160) (actual time=11.145..424.021 rows=10562 loops=1)
               Buffers: shared hit=117682
               I/O Timings: read=0.000 write=0.000
               ->  Function Scan on unnest states  (cost=0.00..0.02 rows=2 width=2) (actual time=0.022..0.024 rows=2 loops=1)
                     I/O Timings: read=0.000 write=0.000
               ->  Nested Loop  (cost=207.27..16842.85 rows=8080 width=160) (actual time=5.623..210.609 rows=5281 loops=2)
                     Buffers: shared hit=117682
                     I/O Timings: read=0.000 write=0.000
                     ->  Function Scan on unnest namespace_ids  (cost=0.03..0.13 rows=10 width=8) (actual time=5.408..5.471 rows=410 loops=2)
                           Buffers: shared hit=465
                           I/O Timings: read=0.000 write=0.000
                     ->  Nested Loop  (cost=207.24..1676.19 rows=808 width=160) (actual time=0.208..0.497 rows=13 loops=820)
                           Buffers: shared hit=117217
                           I/O Timings: read=0.000 write=0.000
                           ->  Function Scan on unnest report_types  (cost=0.00..0.08 rows=8 width=2) (actual time=0.000..0.001 rows=8 loops=820)
                                 I/O Timings: read=0.000 write=0.000
                           ->  Limit  (cost=207.24..207.49 rows=101 width=160) (actual time=0.060..0.061 rows=2 loops=6560)
                                 Buffers: shared hit=117217
                                 I/O Timings: read=0.000 write=0.000
                                 ->  Sort  (cost=207.24..207.59 rows=139 width=160) (actual time=0.060..0.060 rows=2 loops=6560)
                                       Sort Key: vulnerability_reads.severity DESC, vulnerability_reads.vulnerability_id DESC
                                       Sort Method: quicksort  Memory: 25kB
                                       Buffers: shared hit=117217
                                       I/O Timings: read=0.000 write=0.000
                                       ->  Index Scan using index_vulnerability_reads_common_finder_query_with_namespace_id on public.vulnerability_reads  (cost=0.56..202.29 rows=139 width=160) (actual time=0.003..0.042 rows=56 loops=6560)
                                             Index Cond: ((vulnerability_reads.namespace_id = namespace_ids.namespace_id) AND (vulnerability_reads.state = states.state) AND (vulnerability_reads.report_type = report_types.report_type))
                                             Buffers: shared hit=117217
                                             I/O Timings: read=0.000 write=0.000

Time: 436.132 ms
  - planning: 6.592 ms
  - execution: 429.540 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 117688 (~919.40 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Brian Williams

Merge request reports