Add migration to fix nil location value in vulnerability_occurrences
What does this MR do and why?
Addresses #342025 (closed)
While verifying #338669 (closed), we found that there are many rows in vulnerability_occurrences
table with nil
value in location
column. This makes filtering the vulnerabilities by image
impossible.
!72224 (merged) introduced the fix to populate location
value. This MR adds background migration to populate location
from raw_metadata['location']
.
Migrate Up
$ bundle exec rake db:migrate:up VERSION=20211102103127
== 20211102103127 AddTempIndexToVulnerabilityOccurrences: migrating ===========
-- transaction_open?()
-> 0.0001s
-- index_exists?(:vulnerability_occurrences, [:id, :location], {:where=>"location IS NULL", :name=>"vulnerability_occurrences_location_temp_index", :algorithm=>:concurrently})
-> 0.0077s
-- execute("SET statement_timeout TO 0")
-> 0.0007s
-- add_index(:vulnerability_occurrences, [:id, :location], {:where=>"location IS NULL", :name=>"vulnerability_occurrences_location_temp_index", :algorithm=>:concurrently})
-> 0.1342s
-- execute("RESET statement_timeout")
-> 0.0007s
== 20211102103127 AddTempIndexToVulnerabilityOccurrences: migrated (0.1532s) ==
$ bundle exec rake db:migrate:up VERSION=20211021114401
== 20211021114401 UpdateVulnerabilityOccurrencesLocation: migrating ===========
-- Scheduled 1 UpdateVulnerabilityOccurrencesLocation jobs with a maximum of 1000 records per batch and an interval of 120 seconds.
The migration is expected to take at least 120 seconds. Expect all jobs to have completed after 2021-10-29 06:38:55 UTC."
== 20211021114401 UpdateVulnerabilityOccurrencesLocation: migrated (0.1457s) ==
Migrate Down
$ bundle exec rake db:migrate:down VERSION=20211102103127
== 20211102103127 AddTempIndexToVulnerabilityOccurrences: reverting ===========
-- transaction_open?()
-> 0.0000s
-- indexes(:vulnerability_occurrences)
-> 0.0061s
-- execute("SET statement_timeout TO 0")
-> 0.0006s
-- remove_index(:vulnerability_occurrences, {:algorithm=>:concurrently, :name=>"vulnerability_occurrences_location_temp_index"})
-> 0.0045s
-- execute("RESET statement_timeout")
-> 0.0008s
== 20211102103127 AddTempIndexToVulnerabilityOccurrences: reverted (0.0214s) ==
$ bundle exec rake db:migrate:down VERSION=20211021114401
== 20211021114401 UpdateVulnerabilityOccurrencesLocation: reverting ===========
== 20211021114401 UpdateVulnerabilityOccurrencesLocation: reverted (0.0000s) ==
Queries
From migration:
SELECT "vulnerability_occurrences"."id" FROM "vulnerability_occurrences" WHERE "vulnerability_occurrences"."location" IS NULL AND "vulnerability_occurrences"."id" >= 22295 ORDER BY "vulnerability_occurrences"."id" ASC LIMIT 1 OFFSET 1000
SELECT MIN("vulnerability_occurrences"."id"), MAX("vulnerability_occurrences"."id") FROM "vulnerability_occurrences" WHERE "vulnerability_occurrences"."location" IS NULL AND "vulnerability_occurrences"."id" >= 22295
From Background migration:
WITH cte(cte_id, cte_location) AS MATERIALIZED (VALUES
(1, '{"file":"maven/src/main/java//App.java","start_line":0,"end_line":0,"class":"com.gitlab..App","method":"insecureCypher"}'::jsonb),
(2, '{"file":"maven/src/main/java//App.java","start_line":1,"end_line":1,"class":"com.gitlab..App","method":"insecureCypher"}'::jsonb),
(3, '{"file":"maven/src/main/java//App.java","start_line":2,"end_line":2,"class":"com.gitlab..App","method":"insecureCypher"}'::jsonb),
(4, '{"file":"maven/src/main/java//App.java","start_line":3,"end_line":3,"class":"com.gitlab..App","method":"insecureCypher"}'::jsonb),
(5, '{"file":"maven/src/main/java//App.java","start_line":4,"end_line":4,"class":"com.gitlab..App","method":"insecureCypher"}'::jsonb),
(6, '{"file":"maven/src/main/java//App.java","start_line":5,"end_line":5,"class":"com.gitlab..App","method":"insecureCypher"}'::jsonb),
(7, '{"file":"maven/src/main/java//App.java","start_line":6,"end_line":6,"class":"com.gitlab..App","method":"insecureCypher"}'::jsonb),
(8, '{"file":"maven/src/main/java//App.java","start_line":7,"end_line":7,"class":"com.gitlab..App","method":"insecureCypher"}'::jsonb),
(9, '{"file":"maven/src/main/java//App.java","start_line":8,"end_line":8,"class":"com.gitlab..App","method":"insecureCypher"}'::jsonb),
(10, '{"file":"maven/src/main/java//App.java","start_line":9,"end_line":9,"class":"com.gitlab..App","method":"insecureCypher"}'::jsonb))
UPDATE vulnerability_occurrences
SET
location = cte_location
FROM
cte
WHERE
cte_id = id
Total Duration
- Batch size:
20,000
- Delay:
3.minutes
- No. of sub-batches:
500
- Total time for each sub-batch:
841.542 ms
+100 ms sleep
~=950 ms
(as mentioned in !72788 (comment 721699679)) - Total no. of sub-batches per batch: 20000/500 = 40
- Total time per batch: 40 * 950 =
38000ms
(38 seconds) - Total no. of rows with
location
nil: 12500000 - Total no. of batches with delay of 3 mins: 12500000/20000 = 625
- Total time for 625 batches: 625 * 3 = 1875 mins
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Edited by Sashi Kumar Kumaresan