Async tmp index addition for a `vulnerability_occurrences` backfill
What does this MR do and why?
As part of the epic to drop the
vulnerability_occurrence_pipelines
table, we need to backfill some
new columns to migrate dependent feature to
The backfill MR is scoped to nil
columns
scope_to ->(relation) { relation.where(initial_pipeline_id: nil) }
Therefore, we are making an index to get all of the vulnerability
findings (:id
) that have nil
in the
column being backfilled (:initial_pipeline_id
or
:latest_pipeline_id
)
SQL
select with filter from the background migration
SELECT
"vulnerability_occurrences"."id"
FROM
"vulnerability_occurrences"
WHERE
"vulnerability_occurrences"."id" BETWEEN 15613938 AND 15614938
AND "vulnerability_occurrences"."initial_pipeline_id" IS NULL
ORDER BY
"vulnerability_occurrences"."id" ASC
LIMIT 1000
explain
without index
- first run: https://postgres.ai/console/gitlab/gitlab-production-main/sessions/27459/commands/85518
- second run: https://postgres.ai/console/gitlab/gitlab-production-main/sessions/27459/commands/85519
explain
with index
exec CREATE INDEX on vulnerability_occurrences USING btree (id, initial_pipeline_id);
- first run: https://postgres.ai/console/gitlab/gitlab-production-main/sessions/27459/commands/85524
- second run: https://postgres.ai/console/gitlab/gitlab-production-main/sessions/27459/commands/85526
Follow-up Work
We will need to synchronously add the index, as part of the async index addition process. We also need to run the backfills and then remove the temporary indexes
That work is tracked in the following tasks:
- SYNC | Create temporary indexes to support back... (#454239 - closed) • Michael Becker • 17.0
- Backfill `initial_pipeline_id` column (#451180 - closed) • Michael Becker • 17.1
- Backfill `latest_pipeline_id` column (#451181 - closed) • Michael Becker • 17.1
- Remove temporary indexes added for the backfill (#454243 - closed) • Michael Becker • 17.3
Screenshots
psql output after running migrations locally
\d+ vulnerability_occurrences |
---|
Migration Output
bundle exec rails db:migrate
main: == [advisory_lock_connection] object_id: 122180, pg_backend_pid: 163647
main: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: migrating
main: -- index_exists?(:vulnerability_occurrences, [:id, :initial_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_initial_pipline_id", :where=>"initial_pipeline_id IS NULL", :algorithm=>:concurrently})
main: -> 0.0072s
main: -- add_index_options(:vulnerability_occurrences, [:id, :initial_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_initial_pipline_id", :where=>"initial_pipeline_id IS NULL", :algorithm=>:concurrently})
main: -> 0.0001s
main: -- index_exists?(:vulnerability_occurrences, [:id, :latest_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_latest_pipeline_id", :where=>"latest_pipeline_id IS NULL", :algorithm=>:concurrently})
main: -> 0.0055s
main: -- add_index_options(:vulnerability_occurrences, [:id, :latest_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_latest_pipeline_id", :where=>"latest_pipeline_id IS NULL", :algorithm=>:concurrently})
main: -> 0.0000s
main: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: migrated (0.0411s)
main: == [advisory_lock_connection] object_id: 122180, pg_backend_pid: 163647
ci: == [advisory_lock_connection] object_id: 122600, pg_backend_pid: 163649
ci: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: migrating
ci: -- index_exists?(:vulnerability_occurrences, [:id, :initial_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_initial_pipline_id", :where=>"initial_pipeline_id IS NULL", :algorithm=>:concurrently})
ci: -> 0.0069s
ci: -- add_index_options(:vulnerability_occurrences, [:id, :initial_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_initial_pipline_id", :where=>"initial_pipeline_id IS NULL", :algorithm=>:concurrently})
ci: -> 0.0004s
ci: -- index_exists?(:vulnerability_occurrences, [:id, :latest_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_latest_pipeline_id", :where=>"latest_pipeline_id IS NULL", :algorithm=>:concurrently})
ci: -> 0.0042s
ci: -- add_index_options(:vulnerability_occurrences, [:id, :latest_pipeline_id], {:name=>"tmp_index_vulnerability_occurrences_id_and_latest_pipeline_id", :where=>"latest_pipeline_id IS NULL", :algorithm=>:concurrently})
ci: -> 0.0000s
ci: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: migrated (0.0259s)
ci: == [advisory_lock_connection] object_id: 122600, pg_backend_pid: 163649
bundle exec rails db:migrate:down:main VERSION=20240403020614
bundle exec rails db:migrate:down:main VERSION=20240403020614
main: == [advisory_lock_connection] object_id: 121820, pg_backend_pid: 164756
main: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: reverting
main: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: reverted (0.0221s)
main: == [advisory_lock_connection] object_id: 121820, pg_backend_pid: 164756
bundle exec rails db:migrate:down:ci VERSION=20240403020614
bundle exec rails db:migrate:down:ci VERSION=20240403020614
ci: == [advisory_lock_connection] object_id: 121820, pg_backend_pid: 164349
ci: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: reverting
ci: == 20240403020614 PrepareTmpBackfillIndexForPipelineIdsToVulnerabilityOccurrences: reverted (0.0241s)
ci: == [advisory_lock_connection] object_id: 121820, pg_backend_pid: 164349
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
- follow the steps in the async index docs
Related to #422383 (closed)
Resolves #454236 (closed)