Skip to content

Add MigrateSharedVulnerabilityIdentifiers background migration

What does this MR do and why?

DO NOT MERGE until the temporary index from !113348 (merged) is built!

Adds a background migration and a temporary index to fix erroneously reused vulnerability identifiers.

From #359963 (closed):

Vulnerabilities::Identifier records were erroneously reused across projects: https://gitlab.com/gitlab-org/gitlab/-/issues/354082#note_903368766.

Specifically, for concerned rows, the vulnerability_occurrences.project_id column mismatches the associated vulnerability_identifiers.project_id. This is a permission issue, because project owners cannot access their vulnerability identifiers, as they belong to another project.

We need to create new identifier rows and update the foreign keys of concerned findings accordingly.

Merged MR that fixes the cause: !84607 (merged)

Estimate

This concerns ~165k identifiers and findings.

For a batch size of 1000, this results in ~165 batches. For a batch runtime of 10s and a delay interval of 2min, total estimated runtime is ~6h.

Database

Temporary index

CREATE INDEX tmp_idx_vulnerability_occurrences_on_id_where_report_type_7_99 ON vulnerability_occurrences USING btree (id) WHERE (report_type = ANY (ARRAY[7, 99]));

Queries

Batch

postgres.ai

SELECT DISTINCT
    "vulnerability_occurrences".*
FROM
    "vulnerability_occurrences"
    INNER JOIN "vulnerability_occurrence_identifiers" ON "vulnerability_occurrence_identifiers"."occurrence_id" = "vulnerability_occurrences"."id"
    INNER JOIN "vulnerability_occurrence_identifiers" "finding_identifiers_vulnerability_occurrences_join" ON "finding_identifiers_vulnerability_occurrences_join"."occurrence_id" = "vulnerability_occurrences"."id"
    INNER JOIN "vulnerability_identifiers" ON "vulnerability_identifiers"."id" = "finding_identifiers_vulnerability_occurrences_join"."identifier_id"
WHERE
    "vulnerability_occurrences"."report_type" IN (7, 99)
    AND (vulnerability_occurrences.project_id != vulnerability_identifiers.project_id)
    AND "vulnerability_occurrences"."id" BETWEEN 133989379 AND 133989479
    AND "vulnerability_occurrences"."id" >= 133989379
    AND "vulnerability_occurrences"."id" < 133989429;

Finding identifiers

postgres.ai

SELECT
    "vulnerability_identifiers".*
FROM
    "vulnerability_identifiers"
    INNER JOIN "vulnerability_occurrence_identifiers" ON "vulnerability_identifiers"."id" = "vulnerability_occurrence_identifiers"."identifier_id"
WHERE
    "vulnerability_occurrence_identifiers"."occurrence_id" = 145705877;

Upsert identifier

postgres.ai

INSERT INTO "vulnerability_identifiers" ("created_at", "updated_at", "project_id", "fingerprint", "external_type", "external_id", "name", "url")
    VALUES ('2022-02-02 16:13:30.793515', '2022-02-11 10:36:33.457788', 27933106, 'd70fec41fd56c50abc1b7a3eb161e68cef734a6d', 'cve', 'CVE-2022-22822', 'CVE-2022-22822','http://www.openwall.com/lists/oss-security/2022/01/17/3')
ON CONFLICT ("project_id", "fingerprint")
    DO UPDATE SET
        "created_at" = excluded."created_at", "updated_at" = excluded."updated_at", "external_type" = excluded."external_type", "external_id" = excluded."external_id", "name" = excluded."name", "url" = excluded."url"
    RETURNING
        "id";

Update join table

postgres.ai

UPDATE
    "vulnerability_occurrence_identifiers"
SET
    "identifier_id" = 14002071
WHERE
    "vulnerability_occurrence_identifiers"."occurrence_id" = 133989382
    AND "vulnerability_occurrence_identifiers"."identifier_id" = 14002071;

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #359963 (closed)

Edited by Dominic Bauer

Merge request reports

Loading