Skip to content

Reschedule work_item_type backfill on issues

What does this MR do and why?

Reschedules a background migration we already run before in !71869 (merged). There was a lot of discussion in the last MR that introduced the migration, so I'll try to sum up some of the concerns and how we resolved them in order to end up with the code we have:

  1. We introduced a temp index tmp_index_issues_on_issue_type_and_id on issues (issue_type, id).
  2. We removed the index index_issues_on_work_item_type_id on issues (work_item_type_id) for the foreign key temporarily as it's not used and would prevent HOT updates.
  3. We have a retry mechanism since most of the time the queries would be fast enough not to timeout, but running sample batches for hours showed that some queries would take a large amount of seconds randomly and this was attributed to the large GIN indexes we have on the issues table. No better solution for the moment.
  4. The outer batch will not scope records with WHERE work_item_type_id IS NULL since the issue types a very spread out and might timeout since we cannot create a filter using WHERE (work_item_type_id IS NULL) (would also prevent HOT updates).
  5. Filtering by WHERE work_item_type_id IS NULL in the inner batches shouldn't be a problem since they are smaller and will be fast enough. Keeping the null condition will avoid overwrites.
  6. Downside of having the null condition on the inner bathes and not in the outer batches is that we will have many bathes that have 0 inner batches (since we already run the backfill migration in the past).
  7. This MR schedules 5 different background migrations (one for each work item type in the DB). Only the first one issue_type will take long since the others will use the tmp index to go through a relatively small amount of records.

We needed to reschedule as some parts of the code were still inserting issue records without a work_item_type_id and bypassing the ActiveRecord model. All these places in the code have been fixed, so we can reschedule this migration before validating the not null constraint later.

DB Review

Migration Output

UP

main: == 20220830172142 RescheduleIssueWorkItemTypeIdBackfill: migrating ============
main: == 20220830172142 RescheduleIssueWorkItemTypeIdBackfill: migrated (0.1703s) ===

ci: == 20220830172142 RescheduleIssueWorkItemTypeIdBackfill: migrating ============
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_shared, :gitlab_internal].
ci: == 20220830172142 RescheduleIssueWorkItemTypeIdBackfill: migrated (0.0001s) ===

DOWN

main: == 20220830172142 RescheduleIssueWorkItemTypeIdBackfill: reverting ============
main: == 20220830172142 RescheduleIssueWorkItemTypeIdBackfill: reverted (0.0334s) ===

ci: == 20220830172142 RescheduleIssueWorkItemTypeIdBackfill: reverting ============
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_shared, :gitlab_internal].
ci: == 20220830172142 RescheduleIssueWorkItemTypeIdBackfill: reverted (0.0001s) ===

Query Plans

Plans for the first batch starting a NULL work_item_type_id was found

https://console.postgres.ai/shared/eeae4abb-4812-43f0-afac-ebf8045aa103
SELECT 
  "issues"."id" 
FROM 
  "issues" 
WHERE 
  "issues"."id" BETWEEN 105442002 
  AND 105446050 
  AND "issues"."issue_type" = 0 
  AND "issues"."work_item_type_id" IS NULL 
ORDER BY 
  "issues"."id" ASC 
LIMIT 
  1
https://console.postgres.ai/shared/4d1317f3-5070-4769-8650-acb1d5402aff
SELECT 
  "issues"."id" 
FROM 
  "issues" 
WHERE 
  "issues"."id" BETWEEN 105442002 
  AND 105446050 
  AND "issues"."issue_type" = 0 
  AND "issues"."work_item_type_id" IS NULL 
  AND "issues"."id" >= 4 
ORDER BY 
  "issues"."id" ASC 
LIMIT 
  1 OFFSET 100
https://console.postgres.ai/shared/b9792bc6-63c4-4714-a058-41e579e6f140
UPDATE 
  "issues" 
SET 
  "work_item_type_id" = 1, 
  "lock_version" = COALESCE("lock_version", 0) + 1 
WHERE 
  "issues"."issue_type" = 0 
  AND "issues"."work_item_type_id" IS NULL 
  AND "issues"."id" BETWEEN 105442002 
  AND 105442390

Time Estimates

More detail in #338004 (comment 1083833984)

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #338004 (closed)

Edited by Mario Celi

Merge request reports