Recalculate UUID for all Vulnerability::Finding records, attempt 2
What does this MR do and why?
Schedule recalculating UUID for all Vulnerabilities::Finding
records. There are two major reasons for this:
-
We made a mistake in the algorithm when recalculating this for the first time. See:
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/67578
-
The algorithm changed yet again to improve the tracking method. See:
https://gitlab.com/gitlab-org/gitlab/-/issues/322044
Changes:
-
Update background migration to use signatures
-
Make previous reschedule a no-op so customers doing major version leaps will not schedule the same job multiple times
-
Remove ALL jobs from
background_migration_jobs table
. Succeeded jobs should be removed anyway and pending jobs were not finished because we had records for which the recalculation yielded identical UUIDv5. The reason we had such records was due to mistake outlined in point 1. This situation was fixed byhttps://gitlab.com/gitlab-org/gitlab/-/merge_requests/74008
-
Schedule RecalculateVulnerabilitiesOccurrencesUuid over entire
vulnerability_occurrences
table so we can leverage our new algorithm and finish the migration path from UUIDv4 to UUIDv5.
Related to #341195 (closed)
Database review
db:migrate:up
and db:migrate:down
❯ bundle exec rails db:migrate:down VERSION=20211207125331
== 20211207125331 RemoveJobsForRecalculateVulnerabilitiesOccurrencesUuid: reverting
== 20211207125331 RemoveJobsForRecalculateVulnerabilitiesOccurrencesUuid: reverted (0.0000s)
❯ bundle exec rails db:migrate:up VERSION=20211207125331
== 20211207125331 RemoveJobsForRecalculateVulnerabilitiesOccurrencesUuid: migrating
== 20211207125331 RemoveJobsForRecalculateVulnerabilitiesOccurrencesUuid: migrated (0.0067s)
❯ bundle exec rails db:migrate:down VERSION=20211207135331
== 20211207135331 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences4: reverting
== 20211207135331 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences4: reverted (0.0000s)
❯ bundle exec rails db:migrate:up VERSION=20211207135331
== 20211207135331 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences4: migrating
-- Scheduled 1 RecalculateVulnerabilitiesOccurrencesUuid jobs with a maximum of 2500 records per batch and an interval of 120 seconds.
The migration is expected to take at least 120 seconds. Expect all jobs to have completed after 2021-12-07 16:42:42 UTC."
== 20211207135331 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences4: migrated (0.0754s)
db:migrate:up
and db:migrate:down
for no-op migrations
❯ bundle exec rails db:migrate:down VERSION=20210426225417
== 20210426225417 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences2: reverting
== 20210426225417 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences2: reverted (0.0000s)
❯ bundle exec rails db:migrate:up VERSION=20210426225417
== 20210426225417 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences2: migrating
== 20210426225417 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences2: migrated (0.0000s)
❯ bundle exec rails db:migrate:down VERSION=20210813195518
== 20210813195518 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences3: reverting
== 20210813195518 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences3: reverted (0.0000s)
❯ bundle exec rails db:migrate:up VERSION=20210813195518
== 20210813195518 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences3: migrating
== 20210813195518 ScheduleRecalculateUuidOnVulnerabilitiesOccurrences3: migrated (0.0000s)
Vulnerabilities::Finding
and their Signatures
Selecting a batch of SELECT "vulnerability_occurrences"."id", "vulnerability_occurrences"."report_type", "fingerprint", "vulnerability_occurrences"."location_fingerprint", "vulnerability_occurrences"."project_id"
FROM "vulnerability_occurrences"
INNER JOIN "vulnerability_identifiers" ON "vulnerability_identifiers"."id" = "vulnerability_occurrences"."primary_identifier_id"
WHERE "vulnerability_occurrences"."id" BETWEEN 11524705 AND 11527474
cold cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7519/commands/26739
warm cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7519/commands/26740
SELECT "vulnerability_finding_signatures".* FROM "vulnerability_finding_signatures" WHERE "vulnerability_finding_signatures"."finding_id" BETWEEN 11524705 AND 11527474
cold cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7519/commands/26745
warm cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7519/commands/26746
Updating a batch
I can't get a regular query out of BulkImport
to copypaste into #database-lab. Last time this migration was executed (1 year ago) the UPDATE statement timing was estimated to be around 3-5 seconds on due to slow I/O on replica which was approved so I think we should be fine here since we don't change the batch size or anything.
The first attempt on this migration is !47529 (merged)
Irreversible migration
The background data migration introduced by this MR is designed to delete some vulnerability
records if they are duplicates. Some of the associated entities will be deleted manually in batches(due to the amount of the data) and the rest will be deleted by the foreign key constraints.
There is no way to bring back the deleted records as they are duplicates anyway.
If a duplicate gets found
This will get repeated for every duplicate found when trying to update a batch.
Dropping 1000 of VulnerabilityFindingPipelines
cold cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7646/commands/27159
warm cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7646/commands/27160
Dropping 1 Vulnerability
The most likely scenario unless someone did database changes manually
warm cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7646/commands/27163
Dropping 1000 Vulnerabilities
Technically speaking this shouldn't happen because for the entire time we have been associating a single Vulnerability with no more than one Finding. This can happen if you manually change it in the database though so I opted to err on the side of caution:
cold cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7646/commands/27161
warm cache: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/7646/commands/27162
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.