Skip to content

Add a background migration to fix the `namespace_id`s read records

What does this MR do and why?

This change introduces a background migration to fix the data integrity issue related to namespace_ids of the vulnerability_reads table.

The migration will pass through all the vulnerability_reads records and update the wrong namespace_ids with the correct ones.

Database review

This migration will iterate over ~74M records and will update 1_457_788 records. With the batch size of 10K, there will be ~7_400 batch migration jobs created. With the interval of 2 minutes, we are expecting this migration to be completed in (7400 * 2 / 60 / 24) = ~10 Days. If we increase the batch size to 50K, this would reduce the ETA to ~5 days. This is an option for database maintainer to consider.

rake command outputs

rake db:migrate:up
main: == [advisory_lock_connection] object_id: 242200, pg_backend_pid: 50347
main: == 20230810124545 ScheduleFixingNamespaceIdsOfVulnerabilityReads: migrating ===
main: == 20230810124545 ScheduleFixingNamespaceIdsOfVulnerabilityReads: migrated (0.0753s)

main: == [advisory_lock_connection] object_id: 242200, pg_backend_pid: 50347
rake db:migrate:down
main: == [advisory_lock_connection] object_id: 224680, pg_backend_pid: 50851
main: == 20230810124545 ScheduleFixingNamespaceIdsOfVulnerabilityReads: reverting ===
main: == 20230810124545 ScheduleFixingNamespaceIdsOfVulnerabilityReads: reverted (0.0315s)

main: == [advisory_lock_connection] object_id: 224680, pg_backend_pid: 50851

Queries

The batching will be done for the vulnerability_reads table by using the vulnerability_id column and there is already an index for that column;

CREATE UNIQUE INDEX index_vulnerability_reads_on_vulnerability_id ON vulnerability_reads USING btree (vulnerability_id);
The update query will update at most 100 records each time and it looks something like this
UPDATE
    "vulnerability_reads"
SET
    "namespace_id" = 1
WHERE
    "vulnerability_reads"."vulnerability_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 12, 23, 24, 25, 26, 27, 47, 48, 49, 50, 53, 60, 62, 63, 65, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 90, 91, 92, 93, 95, 101, 102, 105, 106, 107, 108, 119, 120, 122, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 170, 171, 175, 177, 178, 180, 181, 182, 183, 184, 185, 186, 188, 189, 190, 191, 192, 200, 205, 208, 210, 212, 213, 214, 216, 217, 218, 219)

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/21544/commands/70102

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related issue #408764 (closed)

Edited by Mehmet Emin INAC

Merge request reports