Migrate the remediation data into their own records
What does this MR do and why?
Migrate the remediation data into their own records from the json attribute
Related issue: https://gitlab.com/gitlab-org/gitlab/-/issues/239183
DB migrate
$ bundle exec rails db:migrate
main: == 20230118144623 ScheduleMigrationForRemediation: migrating ==================
main: == 20230118144623 ScheduleMigrationForRemediation: migrated (0.0484s) =========
ci: == 20230118144623 ScheduleMigrationForRemediation: migrating ==================
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_shared, :gitlab_internal].
ci: == 20230118144623 ScheduleMigrationForRemediation: migrated (0.0036s) =========
$ bundle exec rails db:rollback:main
main: == 20230118144623 ScheduleMigrationForRemediation: reverting ==================
main: == 20230118144623 ScheduleMigrationForRemediation: reverted (0.0317s) =========
Query plan
Loading batch/subbatch:
select * from vulnerability_occurrences where id between 0 and 400;
.
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/15126/commands/52537.
Time(worst case): 77,255 ms
Fetching existing checksum:
select checksum, id from vulnerability_remediations where project_id = 15080436 and checksum IN ('\x0002fa41d57a269f9706a432d1f91b56637900a3a8aa78806f062ceaa606e4af', '\x00278bb77c6e8455ed6e515f2b6f9d7a0578779a0b7e497862fff7b07bbb88f4', '\x012f281d61aa0edf5f147a768bd15b0dfe0172057b97554b1739ca8447a29c26', '\x018358199d2aa3a29039e74c0b4a2a347d2bc653b278b9ed0edeb41b403c31a2', '\x015861ea25890c67e165018b96249be7269d916365d52730e141de3b70410352', '\x01a4bb50805c7b2878b2f576e7ceb4e63764ea87791c1c43f9ca0699618566e6');
.
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/15126/commands/52530.
Time(worst case): 4,441 ms
Creating remediation record:
insert into vulnerability_remediations (project_id, checksum, created_at, updated_at, summary, file) values (16411240, '\x0002fa41d57a269f9706a432d1f91b56637900a3a8aa78806f062ceaa606e4af', now(), now(), 'test', 'file' );
.
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/15126/commands/52531.
Time(worst case): 42,096 ms
Creating finding/remediation relationship record:
insert into vulnerability_findings_remediations (vulnerability_occurrence_id, vulnerability_remediation_id, created_at, updated_at) values (1,48472,now(),now());
.
https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/15126/commands/52532.
Time(worst case): 103,341 ms
With a subbatch size length set to 400
the total time per subbatch would be 59951,2 ms
.
vulnerability_occurrences
currently has 66264176
records. With subbatch set to 400
and delay set to 2 minutes
, it would require the following amount of days for the migration to complete:
66264176 records / 400 => 165660,44 times x 2 minutes => 331320,88 min / 1440 => 230,083944444444444 days
edit (after the last commit): with the changes performed in the latest commit, these are the estimate of completion:
66264176 records / 4000 => 16566,044 times x 2 minutes => 33132,088 min / 1440 => 23,008394444444444 days
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.