Skip to content

Optimise Vulnerability Auto-Resolving with In-Operator Optimisation

  • Please check this box if this contribution uses AI-generated content (including content generated by GitLab Duo features) as outlined in the GitLab DCO & CLA

What does this MR do and why?

Problem

The current implementation of the vulnerability_ids_for method uses a standard active record query, which can be computationally intensive

Solution

Use Batch Iterator of the InOperatorOptimization module

How to set up and validate locally

  1. import Security Reports repository
  2. Remove a vulnerability from one of the security reports and merge the change

DB

Before the change

Link to postgres.ai

Query

SELECT 
  *, 
  vulnerability_occurrences.vulnerability_id AS vulnerability_id 
FROM 
  (
    WITH recursive "array_cte" AS materialized (
      SELECT 
        "vulnerability_identifiers"."id" 
      FROM 
        "vulnerability_identifiers" 
      WHERE 
        "vulnerability_identifiers"."id" IN (
          391443761, 579797979, 4008437, 521825430, 
          1170120, 1556418, 411142512, 3444484, 
          370052924, 857033, 2332695, 391443762, 
          3444500, 997628, 1224924, 2316204, 
          504345345, 16781253, 3971429, 1737876, 
          6994063, 3444497, 1360628451, 4764, 
          45949332, 922077050, 120051246, 1212213299, 
          34172902, 370052925, 158639479, 3093338, 
          109987259, 45141142, 125204793, 51296492, 
          62216501, 563060, 1432291701, 3305200, 
          1403804268, 903711, 1224922, 5475553, 
          1469903, 1102751, 2239821, 776213981, 
          499539, 1867682, 1030292425, 60737826, 
          1098381, 117401895, 77338273, 579797980, 
          1377423776, 997620, 1693846, 19186429, 
          1212214255, 464943, 3305208, 155954740, 
          4626888, 3093341, 1158290344, 4766, 
          1170122, 1115905169, 136013002, 2316205, 
          158618668, 857030, 4008443, 4117916, 
          391443767, 25552728, 483798999, 2239810, 
          579797981, 3001697, 4008430, 85688157, 
          1158290346, 480228523, 1211583931, 
          195434428, 1186811285, 3589711, 1211575, 
          180760, 511462153, 2916, 2912, 4008446, 
          1451111, 320639382, 1451108, 4772
        )
    ), 
    "recursive_keyset_cte" AS (
      (
        SELECT 
          NULL :: vulnerability_identifiers AS records, 
          array_cte_id_array, 
          vulnerability_identifiers_id_array, 
          0 :: bigint AS count 
        FROM 
          (
            SELECT 
              array_agg("array_cte"."id") AS array_cte_id_array, 
              array_agg(
                "vulnerability_identifiers"."id"
              ) AS vulnerability_identifiers_id_array 
            FROM 
              (
                SELECT 
                  "array_cte"."id" 
                FROM 
                  array_cte
              ) array_cte 
              LEFT JOIN lateral (
                SELECT 
                  "vulnerability_identifiers"."id" AS id 
                FROM 
                  "vulnerability_identifiers" 
                WHERE 
                  "vulnerability_identifiers"."id" = "array_cte"."id" 
                ORDER BY 
                  "vulnerability_identifiers"."id" ASC 
                limit 
                  1
              ) vulnerability_identifiers ON true 
            WHERE 
              "vulnerability_identifiers"."id" IS NOT NULL
          ) array_scope_lateral_query 
        limit 
          1
      ) 
      UNION ALL 
        (
          SELECT 
            (
              SELECT 
                vulnerability_identifiers 
              FROM 
                "vulnerability_identifiers" 
              WHERE 
                "vulnerability_identifiers"."id" = recursive_keyset_cte.vulnerability_identifiers_id_array[position] 
              limit 
                1
            ), 
            array_cte_id_array, 
            recursive_keyset_cte.vulnerability_identifiers_id_array[ : position_query.position - 1] || next_cursor_values.id || recursive_keyset_cte.vulnerability_identifiers_id_array[position_query.position + 1 : ], 
            recursive_keyset_cte.count + 1 
          FROM 
            recursive_keyset_cte, 
            lateral (
              SELECT 
                id, 
                position 
              FROM 
                unnest(
                  vulnerability_identifiers_id_array
                ) WITH ordinality AS u(id, position) 
              WHERE 
                id IS NOT NULL 
              ORDER BY 
                1 ASC 
              limit 
                1
            ) AS position_query, 
            lateral (
              SELECT 
                "record"."id" 
              FROM 
                (
                  VALUES 
                    (NULL)
                ) AS nulls 
                LEFT JOIN (
                  SELECT 
                    "vulnerability_identifiers"."id" AS id 
                  FROM 
                    "vulnerability_identifiers" 
                  WHERE 
                    "vulnerability_identifiers"."id" = recursive_keyset_cte.array_cte_id_array[position] 
                    AND (
                      "vulnerability_identifiers"."id" > recursive_keyset_cte.vulnerability_identifiers_id_array[position]
                    ) 
                  ORDER BY 
                    "vulnerability_identifiers"."id" ASC 
                  limit 
                    1
                ) record ON true 
              limit 
                1
            ) AS next_cursor_values
        )
    ) 
    SELECT 
      (records).* 
    FROM 
      "recursive_keyset_cte" AS "vulnerability_identifiers" 
    WHERE 
      (count <> 0)
  ) vulnerability_identifiers 
  INNER JOIN "vulnerability_occurrences" ON "vulnerability_occurrences"."primary_identifier_id" = "vulnerability_identifiers"."id" 
limit 
  25

Plan

Nested Loop  (cost=1.00..1999.89 rows=659 width=16)
  ->  Index Only Scan using vulnerability_identifiers_pkey on vulnerability_identifiers  (cost=0.43..210.12 rows=100 width=8)
        Index Cond: (id = ANY ('{391443761,579797979,4008437,521825430,1170120,1556418,411142512,3444484,370052924,857033,2332695,391443762,3444500,997628,1224924,2316204,504345345,16781253,3971429,1737876,6994063,3444497,1360628451,4764,45949332,922077050,120051246,1212213299,34172902,370052925,158639479,3093338,109987259,45141142,125204793,51296492,62216501,563060,1432291701,3305200,1403804268,903711,1224922,5475553,1469903,1102751,2239821,776213981,499539,1867682,1030292425,60737826,1098381,117401895,77338273,579797980,1377423776,997620,1693846,19186429,1212214255,464943,3305208,155954740,4626888,3093341,1158290344,4766,1170122,1115905169,136013002,2316205,158618668,857030,4008443,4117916,391443767,25552728,483798999,2239810,579797981,3001697,4008430,85688157,1158290346,480228523,1211583931,195434428,1186811285,3589711,1211575,180760,511462153,2916,2912,4008446,1451111,320639382,1451108,4772}'::bigint[]))
  ->  Index Only Scan using index_vulnerability_occurrences_prim_iden_id_and_vuln_id on vulnerability_occurrences  (cost=0.57..13.09 rows=481 width=16)
        Index Cond: (primary_identifier_id = vulnerability_identifiers.id)

Statistics

Time: 846.881 ms  
  - planning: 5.034 ms  
  - execution: 841.847 ms  
    - I/O read: 825.560 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 743 (~5.80 MiB) from the buffer pool  
  - reads: 658 (~5.10 MiB) from the OS file cache, including disk I/O  
  - dirtied: 61 (~488.00 KiB)  
  - writes: 0 

After the Change

Query

After trying out some optimization options we chose the simple approach i.e.: iterating over identifiers one by one and processing the vulnerabilities by running two separate queries for each identifier as follows
Using short requests will ensure no timeouts

      identifier_ids.each do |identifier_id|
        scope = Vulnerabilities::Finding.where(primary_identifier_id: identifier_id).order(order)

        Gitlab::Pagination::Keyset::Iterator.new(scope: scope)
                                            .each_batch(of: VULNERABILITY_IDENTIFIERS_BATCH_SIZE) do |records|
          vulnerability_ids = records.map(&:vulnerability_id)

          yield ::Vulnerability.id_in(vulnerability_ids).with_states(:detected).with_resolution(true)
        end
      end

SQL query for vulnerability_ids = records.map(&:vulnerability_id) Link to postgres.ai

SELECT 
  "vulnerability_occurrences".* 
FROM 
  "vulnerability_occurrences" 
WHERE 
  "vulnerability_occurrences"."primary_identifier_id" = 1212213299 
ORDER BY 
  "vulnerability_occurrences"."vulnerability_id" ASC 
LIMIT 
  250

Plan

 Limit  (cost=0.57..355.96 rows=250 width=1426) (actual time=96.821..96.823 rows=0 loops=1)
   Buffers: shared hit=4 read=3
   I/O Timings: read=96.663 write=0.000
   ->  Index Scan using index_vulnerability_occurrences_prim_iden_id_and_vuln_id on public.vulnerability_occurrences  (cost=0.57..500.96 rows=352 width=1426) (actual time=96.818..96.819 rows=0 loops=1)
         Index Cond: (vulnerability_occurrences.primary_identifier_id = 1212213299)
         Buffers: shared hit=4 read=3
         I/O Timings: read=96.663 write=0.000

Statistics

Time: 194.719 ms
  - planning: 97.764 ms
  - execution: 96.955 ms
    - I/O read: 96.663 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 4 (~32.00 KiB) from the buffer pool
  - reads: 3 (~24.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

SQL query for yield ::Vulnerability.id_in(vulnerability_ids).with_states(:detected).with_resolution(true) Link to postgres.ai

SELECT 
  "vulnerabilities".* 
FROM 
  "vulnerabilities" 
WHERE 
  "vulnerabilities"."id" IN (9, 11) 
  AND "vulnerabilities"."state" = 1 
  AND "vulnerabilities"."resolved_on_default_branch" = TRUE

Plan

 Index Scan using index_vulnerabilities_on_detected_at_and_id on public.vulnerabilities  (cost=0.57..3.59 rows=1 width=453) (actual time=10.930..10.934 rows=1 loops=1)
   Index Cond: (vulnerabilities.id = 98820130)
   Filter: (vulnerabilities.resolved_on_default_branch AND (vulnerabilities.state = 1))
   Rows Removed by Filter: 0
   Buffers: shared hit=3 read=5 dirtied=1
   I/O Timings: read=10.714 write=0.000

Statistics

Time: 14.205 ms
  - planning: 3.186 ms
  - execution: 11.019 ms
    - I/O read: 10.714 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 3 (~24.00 KiB) from the buffer pool
  - reads: 5 (~40.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 1 (~8.00 KiB)
  - writes: 0

Related to #428522

Edited by Schmil Monderer

Merge request reports