Skip to content

Async tmp index addition for a `vulnerability_occurrences` backfill

What does this MR do and why?

As part of the epic to drop the vulnerability_occurrence_pipelines table, we need to backfill some new columns to migrate dependent feature to

The backfill MR is scoped to nil columns

scope_to ->(relation) { relation.where(initial_pipeline_id: nil) }

Therefore, we are making an index to get all of the vulnerability findings (:id) that have nil in the column being backfilled (:initial_pipeline_id or :latest_pipeline_id)

SQL

select with filter from the background migration

SELECT
    "vulnerability_occurrences"."id"
FROM
    "vulnerability_occurrences"
WHERE
    "vulnerability_occurrences"."id" BETWEEN 15613938 AND 15614938
    AND "vulnerability_occurrences"."initial_pipeline_id" IS NULL
ORDER BY
    "vulnerability_occurrences"."id" ASC
LIMIT 1000

explain without index

explain with index

exec CREATE INDEX on vulnerability_occurrences USING btree (id, initial_pipeline_id);

Follow-up Work

We will need to synchronously add the index, as part of the async index addition process. We also need to run the backfills and then remove the temporary indexes

That work is tracked in the following tasks:

Screenshots

psql output after running migrations locally

\d+ vulnerability_occurrences
image

Migration Output

bundle exec rails db:migrate
main: == [advisory_lock_connection] object_id: 122180, pg_backend_pid: 163647
main: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: migrating 
main: -- index_exists?(:vulnerability_occurrences, [:id, :initial_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_initial_pipline_id", :where=>"initial_pipeline_id IS NULL", :algorithm=>:concurrently})
main:    -> 0.0072s
main: -- add_index_options(:vulnerability_occurrences, [:id, :initial_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_initial_pipline_id", :where=>"initial_pipeline_id IS NULL", :algorithm=>:concurrently})
main:    -> 0.0001s
main: -- index_exists?(:vulnerability_occurrences, [:id, :latest_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_latest_pipeline_id", :where=>"latest_pipeline_id IS NULL", :algorithm=>:concurrently})
main:    -> 0.0055s
main: -- add_index_options(:vulnerability_occurrences, [:id, :latest_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_latest_pipeline_id", :where=>"latest_pipeline_id IS NULL", :algorithm=>:concurrently})
main:    -> 0.0000s
main: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: migrated (0.0411s) 

main: == [advisory_lock_connection] object_id: 122180, pg_backend_pid: 163647
ci: == [advisory_lock_connection] object_id: 122600, pg_backend_pid: 163649
ci: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: migrating 
ci: -- index_exists?(:vulnerability_occurrences, [:id, :initial_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_initial_pipline_id", :where=>"initial_pipeline_id IS NULL", :algorithm=>:concurrently})
ci:    -> 0.0069s
ci: -- add_index_options(:vulnerability_occurrences, [:id, :initial_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_initial_pipline_id", :where=>"initial_pipeline_id IS NULL", :algorithm=>:concurrently})
ci:    -> 0.0004s
ci: -- index_exists?(:vulnerability_occurrences, [:id, :latest_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_latest_pipeline_id", :where=>"latest_pipeline_id IS NULL", :algorithm=>:concurrently})
ci:    -> 0.0042s
ci: -- add_index_options(:vulnerability_occurrences, [:id, :latest_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_latest_pipeline_id", :where=>"latest_pipeline_id IS NULL", :algorithm=>:concurrently})
ci:    -> 0.0000s
ci: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: migrated (0.0259s) 

ci: == [advisory_lock_connection] object_id: 122600, pg_backend_pid: 163649
bundle exec rails db:migrate:down:main VERSION=20240403020614
bundle exec rails db:migrate:down:main VERSION=20240403020614
main: == [advisory_lock_connection] object_id: 121820, pg_backend_pid: 164756
main: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: reverting 
main: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: reverted (0.0221s) 

main: == [advisory_lock_connection] object_id: 121820, pg_backend_pid: 164756
bundle exec rails db:migrate:down:ci VERSION=20240403020614
bundle exec rails db:migrate:down:ci VERSION=20240403020614
ci: == [advisory_lock_connection] object_id: 121820, pg_backend_pid: 164349
ci: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: reverting 
ci: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: reverted (0.0241s) 

ci: == [advisory_lock_connection] object_id: 121820, pg_backend_pid: 164349

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

  1. follow the steps in the async index docs

Related to #422383
Resolves #454236 (closed)

Edited by Michael Becker

Merge request reports