Add a temporary partial index for migrating db/migrate/20180420010616_cleanup_build_stage_migration.rb
In https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/18778, we added a migration that is generally fast if there are few rows, but can hit a statement timeout if there are many rows that match SELECT id FROM ci_builds where stage_id IS NULL
.
In particular, the code in https://gitlab.com/gitlab-org/gitlab-ce/blob/5b0e96d09a7b6309384ef4c300c56872adba0626/app/models/concerns/each_batch.rb#L46-49 causes this query to run:
SELECT "ci_builds"."id" FROM "ci_builds" WHERE (stage_id IS NULL) ORDER BY "ci_builds"."id" ASC LIMIT 1
The ORDER BY hits statement timeouts because the database has to load all IDs and sort.
Staging had over 200,000 rows matching in a total of 34+ million rows.
Production only had 143 rows matching in 64+ million rows.
For production, this migration will be fast. For staging, it fails due to a statement timeout (which is 60 seconds in staging, 15 in production).
We worked around the issue by adding a partial index in https://gitlab.com/gitlab-com/infrastructure/issues/4176#note_72228837:
CREATE INDEX CONCURRENTLY tmp_test_index_null_stage ON ci_builds (id) WHERE (stage_id IS NULL);
This allowed the migration to proceed, although it appears that it will take an hour or so to finish.
It's probably a good idea to create this index and drop it when the migration is done scheduling.
/cc: @grzesiek, @abrandl, @mayra-cabrera, @meks