Skip to content

Migrate erroneously shared vulnerability identifiers

Summary

Vulnerabilities::Identifier records were erroneously reused across projects: https://gitlab.com/gitlab-org/gitlab/-/issues/354082#note_903368766.

Specifically, for concerned rows, the vulnerability_occurrences.project_id column mismatches the associated vulnerability_identifiers.project_id. This is a permission issue, because project owners cannot access their vulnerability identifiers, as they belong to another project.

We need to create new identifier rows and update the foreign keys of concerned findings accordingly.

Merged MR that fixes the cause: !84607 (merged)

Steps to reproduce

n/a

Example Project

https://gitlab.com/gitlab-org/protect/demos/demo-cluster

What is the current bug behavior?

Identifiers reused across projects

What is the expected correct behavior?

Identifiers are created per project and not shared

Relevant logs and/or screenshots

n/a

Output of checks

n/a

Implementation plan

  • backend database Identify findings with incorrect identifiers (where identifier.project_id is different than finding.project_id) and aggregate finding ids by vulnerability_identifier.id
SELECT
  vulnerability_identifiers.id, array_agg(vulnerability_occurrences.id)
FROM
  vulnerability_occurrences
  JOIN vulnerability_occurrence_identifiers ON vulnerability_occurrence_identifiers.occurrence_id = vulnerability_occurrences.id
  JOIN vulnerability_identifiers ON vulnerability_occurrence_identifiers.identifier_id = vulnerability_identifiers.id
WHERE
  vulnerability_identifiers.project_id != vulnerability_occurrences.project_id
  AND vulnerability_occurrences.report_type in (7, 99)
GROUP BY vulnerability_identifiers.id;
  • backend database prepare background migration to:
    • for each finding (we have around ~80k findings with this problem) found by query presented in the last point, duplicate the identifier and set correct vulnerability_identifiers.project_id as vulnerability_occurrences.project_id for all identifiers to vulnerability_occurrences.project_id => this will solve the problem where identifier was not accessible on GraphQL API,
    • delete all missmatching identifiers from database:
SELECT
  vulnerability_identifiers.id
FROM
  vulnerability_occurrences
  JOIN vulnerability_occurrence_identifiers ON vulnerability_occurrence_identifiers.occurrence_id = vulnerability_occurrences.id
  JOIN vulnerability_identifiers ON vulnerability_occurrence_identifiers.identifier_id = vulnerability_identifiers.id
WHERE
  vulnerability_identifiers.project_id != vulnerability_occurrences.project_id
  AND vulnerability_occurrences.report_type in (7, 99)
Edited by Alan (Maciej) Paruszewski