Skip to content

Mark stale `security_scans` as `purged`

What does this MR do and why?

This MR introduces a background data migration to mark the stale security_scans records as purged.

Related to #351524 (closed).

Database review

Post-deployment migration

This MR introduces a post-deployment migration to schedule the data migration jobs.

Command outputs
Post-deployment migration
rake db:migrate:up
== 20220407163559 SchedulePurgingStaleSecurityScans: migrating ================
-- transaction_open?()
   -> 0.0000s
-- Scheduled 1 PurgeStaleSecurityScans jobs with a maximum of 10000 records per batch and an interval of 120 seconds.

The migration is expected to take at least 120 seconds. Expect all jobs to have completed after 2022-05-17 13:52:01 UTC."
== 20220407163559 SchedulePurgingStaleSecurityScans: migrated (0.1447s) =======
rake db:migrate:down
== 20220407163559 SchedulePurgingStaleSecurityScans: reverting ================
== 20220407163559 SchedulePurgingStaleSecurityScans: reverted (0.0000s) =======
Queries used
1) Get the last stale record ID
SELECT
    "security_scans".*
FROM
    "security_scans"
WHERE (created_at < '2022-02-16 13:51:08.342527')
ORDER BY
    "security_scans"."created_at" DESC
LIMIT 1
 Limit  (cost=0.44..0.50 rows=1 width=64) (actual time=5.642..5.644 rows=1 loops=1)
   Buffers: shared read=4
   I/O Timings: read=5.588 write=0.000
   ->  Index Scan using index_security_scans_on_created_at on public.security_scans  (cost=0.44..1105061.25 rows=16808981 width=64) (actual time=5.640..5.640 rows=1 loops=1)
         Index Cond: (security_scans.created_at < '2022-02-16 13:51:08.342527+00'::timestamp with time zone)
         Buffers: shared read=4
         I/O Timings: read=5.588 write=0.000

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10159/commands/35846

2) Get the lower bound of the first batch
SELECT
    "security_scans"."id"
FROM
    "security_scans"
WHERE (id <= 17535986)
ORDER BY
    "security_scans"."id" ASC
LIMIT 1
Limit  (cost=0.44..0.46 rows=1 width=8) (actual time=4.149..4.151 rows=1 loops=1)
   Buffers: shared read=4
   I/O Timings: read=4.089 write=0.000
   ->  Index Only Scan using security_scans_pkey on public.security_scans  (cost=0.44..458811.91 rows=16807480 width=8) (actual time=4.147..4.148 rows=1 loops=1)
         Index Cond: (security_scans.id <= 17535986)
         Heap Fetches: 0
         Buffers: shared read=4
         I/O Timings: read=4.089 write=0.000

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10159/commands/35847

3) Get the lower bound of the next batch
SELECT
    "security_scans"."id"
FROM
    "security_scans"
WHERE (id <= 17535986)
    AND "security_scans"."id" >= 1
ORDER BY
    "security_scans"."id" ASC
LIMIT 1 OFFSET 10000
Limit  (cost=298.42..298.45 rows=1 width=8) (actual time=39.322..39.324 rows=1 loops=1)
   Buffers: shared hit=25 read=31 dirtied=2
   I/O Timings: read=35.231 write=0.000
   ->  Index Only Scan using security_scans_pkey on public.security_scans  (cost=0.44..500830.61 rows=16807480 width=8) (actual time=0.036..38.663 rows=10001 loops=1)
         Index Cond: ((security_scans.id <= 17535986) AND (security_scans.id >= 1))
         Heap Fetches: 8
         Buffers: shared hit=25 read=31 dirtied=2
         I/O Timings: read=35.231 write=0.000

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10159/commands/35848

Background job

The background job receives the range of scan IDs and updates the security_scans records as purged in batches.

1)Get the lower bound of the batch
SELECT
    "security_scans"."id"
FROM
    "security_scans"
WHERE
    "security_scans"."id" BETWEEN 1 AND 10381
ORDER BY
    "security_scans"."id" ASC
LIMIT 1
Limit  (cost=0.44..0.53 rows=1 width=8) (actual time=0.048..0.049 rows=1 loops=1)
   Buffers: shared hit=7
   I/O Timings: read=0.000 write=0.000
   ->  Index Only Scan using security_scans_pkey on public.security_scans  (cost=0.44..965.52 rows=10057 width=8) (actual time=0.046..0.046 rows=1 loops=1)
         Index Cond: ((security_scans.id >= 1) AND (security_scans.id <= 10381))
         Heap Fetches: 0
         Buffers: shared hit=7
         I/O Timings: read=0.000 write=0.000

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10159/commands/35849

2)Get the lower bound of the next batch
SELECT
    "security_scans"."id"
FROM
    "security_scans"
WHERE
    "security_scans"."id" BETWEEN 1 AND 10381
ORDER BY
    "security_scans"."id" ASC
LIMIT 1 OFFSET 1000
Limit  (cost=96.40..96.49 rows=1 width=8) (actual time=0.305..0.306 rows=1 loops=1)
   Buffers: shared hit=11
   I/O Timings: read=0.000 write=0.000
   ->  Index Only Scan using security_scans_pkey on public.security_scans  (cost=0.44..965.52 rows=10057 width=8) (actual time=0.048..0.233 rows=1001 loops=1)
         Index Cond: ((security_scans.id >= 1) AND (security_scans.id <= 10381))
         Heap Fetches: 8
         Buffers: shared hit=11
         I/O Timings: read=0.000 write=0.000

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10159/commands/35850

2) Mark the `security_scans` records as `purged`
UPDATE
    "security_scans"
SET
    "status" = 6
WHERE
    "security_scans"."id" BETWEEN 1 AND 10381
    AND "security_scans"."id" >= 1
    AND "security_scans"."id" < 1044
    AND "security_scans"."status" = 1
ModifyTable on public.security_scans  (cost=0.44..612.15 rows=729 width=74) (actual time=2179.728..2179.730 rows=0 loops=1)
   Buffers: shared hit=31216 read=1363 dirtied=1197 written=20
   I/O Timings: read=2047.628 write=0.000
   ->  Index Scan using security_scans_pkey on public.security_scans  (cost=0.44..612.15 rows=729 width=74) (actual time=7.471..160.033 rows=1000 loops=1)
         Index Cond: ((security_scans.id >= 1) AND (security_scans.id <= 10381) AND (security_scans.id >= 1) AND (security_scans.id < 1044))
         Filter: (security_scans.status = 1)
         Rows Removed by Filter: 0
         Buffers: shared hit=23 read=86
         I/O Timings: read=156.537 write=0.000

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10159/commands/35851

Estimations

As of writing, the security_scans table has 16_780_661 records created 90 days ago. Given the batch size of 10_000 scans, this migration will schedule around 1_678 background jobs. If each takes 2 minutes to complete, the whole migration process will take around 2 days.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Mehmet Emin INAC

Merge request reports