Skip to content

Recalculate UUID for all Vulnerability::Finding records, attempt 2

What does this MR do and why?

Schedule recalculating UUID for all Vulnerabilities::Finding records. There are two major reasons for this:

  1. We made a mistake in the algorithm when recalculating this for the first time. See: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/67578

  2. The algorithm changed yet again to improve the tracking method. See: https://gitlab.com/gitlab-org/gitlab/-/issues/322044

Changes:

  • Update background migration to use signatures

  • Make previous reschedule a no-op so customers doing major version leaps will not schedule the same job multiple times

  • Remove ALL jobs from background_migration_jobs table. Succeeded jobs should be removed anyway and pending jobs were not finished because we had records for which the recalculation yielded identical UUIDv5. The reason we had such records was due to mistake outlined in point 1. This situation was fixed by https://gitlab.com/gitlab-org/gitlab/-/merge_requests/74008

  • Schedule RecalculateVulnerabilitiesOccurrencesUuid over entire vulnerability_occurrences table so we can leverage our new algorithm and finish the migration path from UUIDv4 to UUIDv5.

Related to #341195 (closed)

Database review

db:migrate:up and db:migrate:down

❯ bundle exec rails db:migrate:down VERSION=20211207125331
== 20211207125331 RemoveJobsForRecalculateVulnerabilitiesOccurrencesUuid: reverting
== 20211207125331 RemoveJobsForRecalculateVulnerabilitiesOccurrencesUuid: reverted (0.0000s)

❯ bundle exec rails db:migrate:up VERSION=20211207125331
== 20211207125331 RemoveJobsForRecalculateVulnerabilitiesOccurrencesUuid: migrating
== 20211207125331 RemoveJobsForRecalculateVulnerabilitiesOccurrencesUuid: migrated (0.0067s)

❯ bundle exec rails db:migrate:down VERSION=20211207135331
== 20211207135331 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences4: reverting
== 20211207135331 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences4: reverted (0.0000s)

❯ bundle exec rails db:migrate:up VERSION=20211207135331
== 20211207135331 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences4: migrating
-- Scheduled 1 RecalculateVulnerabilitiesOccurrencesUuid jobs with a maximum of 2500 records per batch and an interval of 120 seconds.

The migration is expected to take at least 120 seconds. Expect all jobs to have completed after 2021-12-07 16:42:42 UTC."
== 20211207135331 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences4: migrated (0.0754s)

db:migrate:up and db:migrate:down for no-op migrations

❯ bundle exec rails db:migrate:down VERSION=20210426225417
== 20210426225417 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences2: reverting
== 20210426225417 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences2: reverted (0.0000s)

❯ bundle exec rails db:migrate:up VERSION=20210426225417
== 20210426225417 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences2: migrating
== 20210426225417 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences2: migrated (0.0000s)

❯ bundle exec rails db:migrate:down VERSION=20210813195518
== 20210813195518 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences3: reverting
== 20210813195518 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences3: reverted (0.0000s)

❯ bundle exec rails db:migrate:up VERSION=20210813195518
== 20210813195518 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences3: migrating
== 20210813195518 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences3: migrated (0.0000s)

Selecting a batch of Vulnerabilities::Finding and their Signatures

SELECT "vulnerability_occurrences"."id", "vulnerability_occurrences"."report_type", "fingerprint", "vulnerability_occurrences"."location_fingerprint", "vulnerability_occurrences"."project_id"
FROM "vulnerability_occurrences"
INNER JOIN "vulnerability_identifiers" ON "vulnerability_identifiers"."id" = "vulnerability_occurrences"."primary_identifier_id"
WHERE "vulnerability_occurrences"."id" BETWEEN 11524705 AND 11527474

cold cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7519/commands/26739

warm cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7519/commands/26740

SELECT "vulnerability_finding_signatures".* FROM "vulnerability_finding_signatures" WHERE "vulnerability_finding_signatures"."finding_id" BETWEEN 11524705 AND 11527474

cold cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7519/commands/26745

warm cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7519/commands/26746

Updating a batch

I can't get a regular query out of BulkImport to copypaste into #database-lab. Last time this migration was executed (1 year ago) the UPDATE statement timing was estimated to be around 3-5 seconds on due to slow I/O on replica which was approved so I think we should be fine here since we don't change the batch size or anything.

The first attempt on this migration is !47529 (merged)

Irreversible migration

The background data migration introduced by this MR is designed to delete some vulnerability records if they are duplicates. Some of the associated entities will be deleted manually in batches(due to the amount of the data) and the rest will be deleted by the foreign key constraints.

There is no way to bring back the deleted records as they are duplicates anyway.

If a duplicate gets found

This will get repeated for every duplicate found when trying to update a batch.

Dropping 1000 of VulnerabilityFindingPipelines

cold cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7646/commands/27159

warm cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7646/commands/27160

Dropping 1 Vulnerability

The most likely scenario unless someone did database changes manually

warm cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7646/commands/27163

Dropping 1000 Vulnerabilities

Technically speaking this shouldn't happen because for the entire time we have been associating a single Vulnerability with no more than one Finding. This can happen if you manually change it in the database though so I opted to err on the side of caution:

cold cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7646/commands/27161

warm cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7646/commands/27162

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Mehmet Emin INAC

Merge request reports