Improve query for description_versions backfill
What does this MR do and why?
It was found that the query planner makes suboptimal choices depending on how the description_versions table is filtered in the UPDATE query.
Problem 1: Dense ranges (original timeout)
The original query only filtered on epics.id IN (sub_batch), which gave PostgreSQL no information about which description_versions rows to target. On dense ranges (epics with many description versions), the planner chose a merge join with a full sequential scan of the description_versions.epic_id index, causing the BBM to timeout: !227413 (comment 3185132504).
Problem 2: Sparse ranges (IN (subquery) approach)
Adding a description_versions.epic_id IN (sub_batch) condition fixed dense ranges by switching the planner to nested loops. However, on sparse ranges (where few or no description versions exist), the planner chose a merge semi join that still scanned the full description_versions.epic_id index, taking ~2 minutes even when updating 0 rows.
Solution: Range conditions
Instead of using IN (subquery) on the description_versions side, we extract MIN(id) and MAX(id) from the sub-batch and apply direct range conditions:
AND description_versions.epic_id >= #{min_id}
AND description_versions.epic_id <= #{max_id}This gives the planner the bounds it needs to do a bounded index scan on description_versions.epic_id in all cases. The IN (subquery) on epics.id is retained for exact filtering.
Other changes
- Removed
AND epics.issue_id IS NOT NULLsince there is aCHECKconstraint (check_450724d1bb) ensuringissue_id IS NOT NULL - Added
.reorder(nil)tosub_batchcalls to strip theORDER BYclause appended by the keyset iterator, which would otherwise cause aPG::GroupingErrorwhen used in subqueries - Reduced batch sizes to
BATCH_SIZE = 250,SUB_BATCH_SIZE = 50, just to be extra safe
References
Screenshots or screen recordings
Not provided
How to set up and validate locally
Not provided
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Database
Before: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/50341/commands/149443
After: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/50341/commands/149446
Related to #454439 (closed)