Skip to content

Populate missing dismissal information for vulnerabilities

What does this MR do?

This MR populates missing dismissal information for the vulnerability records with an inline data migration.

Related to #287803 (closed).

Database review

There are only 510 records to be migrated with this migration, this is why an inline migration is chosen.

The lookup queries take less than a second on production replica with this script.
Vulnerability.include(EachBatch)

start = Gitlab::Metrics::System.monotonic_time
Vulnerability.dismissed.where('dismissed_at IS NULL OR dismissed_by_id IS NULL').each_batch(of: 100) do |batch|
  batch.pluck(:id)
end
diff = Gitlab::Metrics::System.monotonic_time - start
puts diff # => 0.8272816969547421
Here are the full set of queries
D, [2021-03-25T10:55:37.053021 #27837] DEBUG -- :   Vulnerability Load (2.3ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) ORDER BY "vulnerabilities"."id" ASC LIMIT 1
D, [2021-03-25T10:55:37.080241 #27837] DEBUG -- :   Vulnerability Load (2.3ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 1406747 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.083941 #27837] DEBUG -- :    (2.4ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 1406747 AND "vulnerabilities"."id" < 2135692
D, [2021-03-25T10:55:37.088253 #27837] DEBUG -- :   Vulnerability Load (3.3ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2135692 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.090572 #27837] DEBUG -- :    (1.3ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2135692 AND "vulnerabilities"."id" < 2847317
D, [2021-03-25T10:55:37.092652 #27837] DEBUG -- :   Vulnerability Load (1.2ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2847317 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.095150 #27837] DEBUG -- :    (1.7ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2847317 AND "vulnerabilities"."id" < 2848959
D, [2021-03-25T10:55:37.096652 #27837] DEBUG -- :   Vulnerability Load (0.6ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2848959 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.098329 #27837] DEBUG -- :    (0.9ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2848959 AND "vulnerabilities"."id" < 2851660
D, [2021-03-25T10:55:37.100273 #27837] DEBUG -- :   Vulnerability Load (1.2ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2851660 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.101651 #27837] DEBUG -- :    (0.7ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 2851660 AND "vulnerabilities"."id" < 3395455
D, [2021-03-25T10:55:37.104311 #27837] DEBUG -- :   Vulnerability Load (1.9ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 3395455 ORDER BY "vulnerabilities"."id" ASC LIMIT 1 OFFSET 100
D, [2021-03-25T10:55:37.105703 #27837] DEBUG -- :    (0.7ms)  /*application:console*/ SELECT "vulnerabilities"."id" FROM "vulnerabilities" WHERE "vulnerabilities"."state" = 2 AND (dismissed_at IS NULL OR dismissed_by_id IS NULL) AND "vulnerabilities"."id" >= 3395455
Query plans

All the execution plans are with a cold cache.

The query to load the first ID of the vulnerabilities
SELECT
    "vulnerabilities"."id"
FROM
    "vulnerabilities"
WHERE
    "vulnerabilities"."state" = 2
    AND (dismissed_at IS NULL
        OR dismissed_by_id IS NULL)
ORDER BY
    "vulnerabilities"."id" ASC
LIMIT 1
Limit  (cost=0.27..0.32 rows=1 width=8) (actual time=6.656..6.657 rows=1 loops=1)
   Buffers: shared read=4 dirtied=1
   I/O Timings: read=5.674
   ->  Index Only Scan using temporary_index_vulnerabilities_on_id on public.vulnerabilities  (cost=0.27..18467.49 rows=379891 width=8) (actual time=6.653..6.654 rows=1 loops=1)
         Heap Fetches: 1
         Buffers: shared read=4 dirtied=1
         I/O Timings: read=5.674
The query to load the IDs of vulnerability with pluck
SELECT
    "vulnerabilities"."id"
FROM
    "vulnerabilities"
WHERE
    "vulnerabilities"."state" = 2
    AND (dismissed_at IS NULL
        OR dismissed_by_id IS NULL)
    AND "vulnerabilities"."id" >= 1406747
ORDER BY
    "vulnerabilities"."id" ASC
LIMIT 1 OFFSET 100
Limit  (cost=5.88..5.94 rows=1 width=8) (actual time=16.725..16.727 rows=1 loops=1)
   Buffers: shared hit=17 read=15 dirtied=7
   I/O Timings: read=15.460
   ->  Index Only Scan using temporary_index_vulnerabilities_on_id on public.vulnerabilities  (cost=0.27..17370.27 rows=309802 width=8) (actual time=0.099..16.710 rows=101 loops=1)
         Index Cond: (vulnerabilities.id >= 1406747)
         Heap Fetches: 15
         Buffers: shared hit=17 read=15 dirtied=7
         I/O Timings: read=15.460
The query to update the records with vulnerability IDs
UPDATE
    vulnerabilities
SET
    dismissed_at = updated_at,
    dismissed_by_id = COALESCE(updated_by_id, last_edited_by_id, author_id)
WHERE
    vulnerabilities.id IN (...) --- Used exactly 100 IDs returned by the previous queries but removed here to make it easier to read
ModifyTable on public.vulnerabilities  (cost=0.43..321.46 rows=100 width=296) (actual time=460.749..460.750 rows=0 loops=1)
   Buffers: shared hit=6436 read=371 dirtied=216
   I/O Timings: read=441.564
   ->  Index Scan using vulnerabilities_pkey on public.vulnerabilities  (cost=0.43..321.46 rows=100 width=296) (actual time=12.641..119.134 rows=100 loops=1)
         Index Cond: (vulnerabilities.id = ANY ('{...}'::bigint[]))
         Buffers: shared hit=302 read=98
         I/O Timings: read=117.660

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Mehmet Emin INAC

Merge request reports