Skip to content

Populate data for vulnerability_reads from vulnerabilities

What does this MR do and why?


This MR is dependent on !75546 (merged) and only after the background migration is finished, this could be merged


Addresses #335501 (closed)

This MR adds a background migration to populate vulnerability_reads table which was introduced in !74733 (merged)

We are implementing few triggers to populate vulnerability_reads synchronously and this migration will populate the rows for older data. The triggers MR is still in review: !75230 (merged)

Database Changes

Migrate Up
bundle exec rake db:migrate:up VERSION=20220107064845  
== 20220107064845 PopulateVulnerabilityReads: migrating =======================
-- Scheduled 2 PopulateVulnerabilityReads jobs with a maximum of 10000 records per batch and an interval of 120 seconds.

The migration is expected to take at least 240 seconds. Expect all jobs to have completed after 2022-01-18 16:54:17 UTC."
== 20220107064845 PopulateVulnerabilityReads: migrated (0.1012s) ==============

Migrate Down
bundle exec rake db:migrate:down VERSION=20220107064845
== 20220107064845 PopulateVulnerabilityReads: reverting =======================
== 20220107064845 PopulateVulnerabilityReads: reverted (0.0000s) ==============

Query Analysis

Query
INSERT INTO vulnerability_reads (vulnerability_id, project_id, scanner_id, report_type, severity, state, has_issues, resolved_on_default_branch, uuid, location_image)
SELECT
   vulnerabilities.id,
   vulnerabilities.project_id,
   vulnerability_scanners.id,
   vulnerabilities.report_type,
   vulnerabilities.severity,
   vulnerabilities.state,
   CASE
      WHEN
         vulnerability_issue_links.vulnerability_id IS NOT NULL
      THEN
         true
      ELSE
         false
   END
   has_issues,
   vulnerabilities.resolved_on_default_branch,
   vulnerability_occurrences.uuid::uuid,
   vulnerability_occurrences.location ->> 'image'
FROM
   vulnerabilities
   INNER JOIN
      vulnerability_occurrences
      on vulnerability_occurrences.vulnerability_id = vulnerabilities.id
   INNER JOIN
      vulnerability_scanners
      on vulnerability_scanners.id = vulnerability_occurrences.scanner_id
   LEFT JOIN
      vulnerability_issue_links
      on vulnerability_issue_links.vulnerability_id = vulnerabilities.id
WHERE vulnerabilities.id BETWEEN 100000 AND 101000
ON CONFLICT DO NOTHING;

Analysis: console.postgres.ai

Time: 933.707 ms
  - planning: 5.669 ms
  - execution: 928.038 ms
    - I/O read: 756.987 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 21284 (~166.30 MiB) from the buffer pool
  - reads: 1182 (~9.20 MiB) from the OS file cache, including disk I/O
  - dirtied: 121 (~968.00 KiB)
  - writes: 46 (~368.00 KiB)

Time calculation for background migration

  • Total number of records in vulnerabilities: 27121249

  • Batch size: 10,000

  • Sub-batch size: 1000

  • Total no. of batches: 27121249/10_000 = 2713

  • Estimated time per batch:

    • No. of inserts per sub-batch: 10
    • 933ms for insert with 1000 items
    • Sub-batch total: ~1 sec per sub-batch
    • Total batch time: 1 * 10 = 10 sec per batch
  • Time interval: 2 minutes

  • Total Time: (27121249/10000)*2 ~= 5425 minutes ~= 91 hours ~= 3.8 days

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Sashi Kumar Kumaresan

Merge request reports

Loading