Populate data for vulnerability_reads from vulnerabilities
What does this MR do and why?
This MR is dependent on !75546 (merged) and only after the background migration is finished, this could be merged
Addresses #335501 (closed)
This MR adds a background migration to populate vulnerability_reads table which was introduced in !74733 (merged)
We are implementing few triggers to populate vulnerability_reads synchronously and this migration will populate the rows for older data. The triggers MR is still in review: !75230 (merged)
Database Changes
Migrate Up
bundle exec rake db:migrate:up VERSION=20220107064845
== 20220107064845 PopulateVulnerabilityReads: migrating =======================
-- Scheduled 2 PopulateVulnerabilityReads jobs with a maximum of 10000 records per batch and an interval of 120 seconds.
The migration is expected to take at least 240 seconds. Expect all jobs to have completed after 2022-01-18 16:54:17 UTC."
== 20220107064845 PopulateVulnerabilityReads: migrated (0.1012s) ==============
Migrate Down
bundle exec rake db:migrate:down VERSION=20220107064845
== 20220107064845 PopulateVulnerabilityReads: reverting =======================
== 20220107064845 PopulateVulnerabilityReads: reverted (0.0000s) ==============
Query Analysis
Query
INSERT INTO vulnerability_reads (vulnerability_id, project_id, scanner_id, report_type, severity, state, has_issues, resolved_on_default_branch, uuid, location_image)
SELECT
vulnerabilities.id,
vulnerabilities.project_id,
vulnerability_scanners.id,
vulnerabilities.report_type,
vulnerabilities.severity,
vulnerabilities.state,
CASE
WHEN
vulnerability_issue_links.vulnerability_id IS NOT NULL
THEN
true
ELSE
false
END
has_issues,
vulnerabilities.resolved_on_default_branch,
vulnerability_occurrences.uuid::uuid,
vulnerability_occurrences.location ->> 'image'
FROM
vulnerabilities
INNER JOIN
vulnerability_occurrences
on vulnerability_occurrences.vulnerability_id = vulnerabilities.id
INNER JOIN
vulnerability_scanners
on vulnerability_scanners.id = vulnerability_occurrences.scanner_id
LEFT JOIN
vulnerability_issue_links
on vulnerability_issue_links.vulnerability_id = vulnerabilities.id
WHERE vulnerabilities.id BETWEEN 100000 AND 101000
ON CONFLICT DO NOTHING;
Analysis: console.postgres.ai
Time: 933.707 ms
- planning: 5.669 ms
- execution: 928.038 ms
- I/O read: 756.987 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 21284 (~166.30 MiB) from the buffer pool
- reads: 1182 (~9.20 MiB) from the OS file cache, including disk I/O
- dirtied: 121 (~968.00 KiB)
- writes: 46 (~368.00 KiB)
Time calculation for background migration
-
Total number of records in
vulnerabilities:27121249 -
Batch size:
10,000 -
Sub-batch size:
1000 -
Total no. of batches:
27121249/10_000=2713 -
Estimated time per batch:
- No. of inserts per sub-batch: 10
-
933msfor insert with1000items - Sub-batch total:
~1 secper sub-batch - Total batch time:
1 * 10=10 secper batch
-
Time interval:
2 minutes -
Total Time:
(27121249/10000)*2~=5425 minutes~=91 hours~=3.8 days
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.