Skip to content

Purge old `security_scans` records on self-hosted installations

What does this MR do and why?

This change introduces a background migration to purge security_scans records associated with security_findings without finding_data set.

These kinds of records do not exist on GitLab.com as the finding_data column was introduced in '%15.6' and the records created before that milestone are already removed from the database by the data retention policy of 3 months on the security_findings table but there is no guarantee for the self-hosted installations.

Therefore, the background migration introduced in this MR will only run for self-hosted installations.

Related to Run migration to purge `security_scans` associa... (#409963 - closed).

Database review

There are two different sets of queries introduced in this MR. First set of queries run in post-deployment migration and the second set of queries run in background migration.

In addition to those queries, this MR introduces a temporary index to prevent having timeouts while running those queries.

Post-deployment queries
Get the first succeeded security scan
SELECT
    "security_scans".*
FROM
    "security_scans"
WHERE
    "security_scans"."status" = 1
ORDER BY
    "security_scans"."id" ASC
LIMIT 1

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/26126/commands/82244

Get the first finding of the first succeeded security scan
SELECT
    "security_findings".*
FROM
    "security_findings"
WHERE
    "security_findings"."scan_id" = 81053836
ORDER BY
    "security_findings"."id" ASC
LIMIT 1

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/26126/commands/82253

Background migration queries
Query generated by the `scope_to` macro
SELECT
    "security_scans"."id"
FROM
    "security_scans"
WHERE
    "security_scans"."id" BETWEEN 81605760 AND 81712970
    AND "security_scans"."status" = 1
    AND "security_scans"."id" >= 81605760
    AND "security_scans"."id" < 81607765

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/26126/commands/82254

Query to load only the first findings of all the security scans in batch
SELECT
    "security_findings".*
FROM
    unnest('{81605760, 81605762, 81605665, 81605706, 81605707, 81605746, 81605747, 81605748, 81605751, 81605752, 81605753, 81605738, 81605739, 81605787, 81605788, 81606149, 81606150, 81606152, 81605830, 81605831, 81605832, 81605764, 81605765, 81605789, 81605790, 81606168, 81606169, 81606170, 81606171, 81606172, 81605791, 81605792, 81605833, 81605834, 81605773, 81605886, 81605887, 81605888, 81605889, 81605812, 81605813, 81605817, 81605822, 81605818, 81605819, 81605820, 81605821, 81605823, 81605824, 81605794, 81605796, 81605798, 81607728, 81607730, 81605869, 81605871, 81605876, 81605877, 81607750, 81607754, 81607756, 81607759, 81607762, 81607767, 81605947, 81605948, 81605949, 81605950, 81605908, 81605909, 81605910, 81605911, 81605912, 81605920, 81605921, 81605922, 81605923, 81605924, 81605034, 81605035, 81605036, 81605953, 81605954, 81605955, 81605956, 81605957, 81605972, 81605973, 81605974, 81605975, 81606006, 81606007, 81605952, 81607739, 81607747, 81607751, 81607763, 81607760, 81607764, 81607765}'::bigint[]) AS scan_ids (id),
    LATERAL (
        SELECT
            sub_findings.*
        FROM
            security_findings sub_findings
        WHERE
            sub_findings.scan_id = scan_ids.id
        ORDER BY
            sub_findings.id ASC
        LIMIT 1) AS security_findings

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/26126/commands/82255

Update query
UPDATE
    "security_scans"
SET
    "status" = 6
WHERE
    "security_scans"."id" IN (81605760, 81605762, 81605665, 81605706, 81605707, 81605746, 81605747, 81605748, 81605751, 81605752, 81605753, 81605738, 81605739, 81605787, 81605788, 81606149, 81606150, 81606152, 81605830, 81605831, 81605832, 81605764, 81605765, 81605789, 81605790, 81606168, 81606169, 81606170, 81606171, 81606172, 81605791, 81605792, 81605833, 81605834, 81605773, 81605886, 81605887, 81605888, 81605889, 81605812, 81605813, 81605817, 81605822, 81605818, 81605819, 81605820, 81605821, 81605823, 81605824, 81605794, 81605796, 81605798, 81607728, 81607730, 81605869, 81605871, 81605876, 81605877, 81607750, 81607754, 81607756, 81607759, 81607762, 81607767, 81605947, 81605948, 81605949, 81605950, 81605908, 81605909, 81605910, 81605911, 81605912, 81605920, 81605921, 81605922, 81605923, 81605924, 81605034, 81605035, 81605036, 81605953, 81605954, 81605955, 81605956, 81605957, 81605972, 81605973, 81605974, 81605975, 81606006, 81606007, 81605952, 81607739, 81607747, 81607751, 81607763, 81607760, 81607764, 81607765)

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/26126/commands/82256

Edited by Mehmet Emin INAC

Merge request reports