Improve background migrations. Deprecate `bulk_queue_background_migration_jobs_by_range` & `queue_background_migration_jobs_by_range_at_intervals`
The problem
During work on https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/24512 and review of https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/25107, both of which are supposed to run background migrations on large tables, it was found that `bulk_queue_background_migration_jobs_by_range` and `queue_background_migration_jobs_by_range_at_intervals`, the helpers usually used for this, have three major drawbacks:
- not scalable (see below);
- cannot work with tables that have no PK named `id` but do have a unique index with NOT NULL on some column (not uncommon in the GitLab DB schema, due to the practice of splitting tables by creating a new one with a 1:1 relationship to the main one);
- finally, to process 3 tables, 3 separate classes have to be defined – there is a lack of abstraction.
Scalability issues
`bulk_queue_background_migration_jobs_by_range` and `queue_background_migration_jobs_by_range_at_intervals` are not scalable from the DB perspective. The loop at https://gitlab.com/gitlab-org/gitlab-ce/blob/master/lib/gitlab/database/migration_helpers.rb#L978 schedules N / BATCH_SIZE jobs up front – a typical N+1 SELECT pattern. That is unworkable if you have to process 1B rows and need a BATCH_SIZE of ~1-20k.
From the DB perspective, batch sizes much larger than 10k are bad (too-long-lasting transactions => locks, bloat issues); on the other hand, issuing 1k+ SELECTs and spending dozens of minutes just to schedule thousands of jobs is also a huge stress (e.g., to process `events` in batches of 2,000 rows, it takes ~30 minutes just to schedule the jobs – to send them to Sidekiq – and with a batch size of 25,000 it still takes ~15-18 min). So, with row counts above 100M this approach is limited on both ends.
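To make the scheduling cost concrete, here is a minimal plain-Ruby simulation of the up-front scheduling pattern, with an array standing in for the table, a counter standing in for the range SELECTs, and a list standing in for the Sidekiq queue (all class and method names here are illustrative, not GitLab code):

```ruby
# Simulation of up-front batch scheduling: the scheduler walks the whole
# table once, issuing one range SELECT per batch, before a single job runs.
class UpfrontScheduler
  attr_reader :select_count, :jobs

  def initialize(row_ids, batch_size)
    @row_ids = row_ids
    @batch_size = batch_size
    @select_count = 0
    @jobs = []
  end

  def schedule_all
    @row_ids.each_slice(@batch_size) do |batch|
      @select_count += 1              # each slice costs one MIN/MAX range SELECT
      @jobs << [batch.first, batch.last]
    end
    @jobs.size
  end
end

scheduler = UpfrontScheduler.new((1..1_000_000).to_a, 2_000)
scheduler.schedule_all
puts scheduler.select_count # => 500
```

At a batch size of 2,000 this is 500 SELECTs per million rows, i.e. on the order of 500,000 SELECTs for a 1B-row table – all issued before a single job has run.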
Capabilities of background migrations
Meanwhile, the background migration mechanism itself supports a better way to run migrations. The migration defined at https://gitlab.com/gitlab-org/gitlab-ce/blob/master/lib/gitlab/background_migration/cleanup_concurrent_schema_change.rb shows that:
- it is possible to schedule a migration which, once the next batch of data is processed, schedules a new job, forming a chain and getting rid of the N+1 SELECTs;
- arbitrary column names can be supported;
- abstraction over table/column names can be implemented.
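A minimal sketch of that chaining idea, assuming an in-memory queue in place of Sidekiq and an array in place of the table (names and interface are illustrative only, not the actual migration code):

```ruby
# In-memory stand-ins for Sidekiq and the table.
QUEUE = []

class ChainedBatchMigration
  BATCH_SIZE = 100

  def initialize(table)
    @table = table # array of ids, standing in for an indexed column
  end

  def perform(start_id)
    # "Next batch" retrieval; with a real index this is an index-only scan.
    batch = @table.select { |id| id >= start_id }.first(BATCH_SIZE)
    return if batch.empty? # chain terminates once 0 rows come back

    process(batch)
    QUEUE << [self, batch.last + 1] # schedule the next link of the chain
  end

  def process(batch); end # the actual data fix would go here
end

migration = ChainedBatchMigration.new((1..350).to_a)
QUEUE << [migration, 1]

jobs_run = 0
until QUEUE.empty?
  job, arg = QUEUE.shift
  job.perform(arg)
  jobs_run += 1
end
puts jobs_run # => 5 (four batches of 100+100+100+50 rows, plus one empty check)
```

Only one job is in flight at a time, and nothing ever has to walk the whole table up front.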
Proposal
The idea is to create a new thing: a helper module/class that schedules just one single job; that job processes the next available batch (retrieving BATCH_SIZE rows via an index-only scan) and then schedules the next job – a chain of jobs that runs until a batch returns 0 rows. If a proper index exists, retrieving the "next batch" is a very efficient task, and it can also report progress (worth logging for visibility); see an example of such a query: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/24512/diffs#9125caf0371666647de544457d3390a829bc8846_0_49
These jobs are to run much more often than current background migrations – the delays should be very small, a matter of seconds or even under 1s. This approach will be scalable and will allow processing up to 1B rows per day on modern hardware.
The main challenge in defining this helper is how to provide an abstraction layer for defining the query or its parts, so that it can become the new way to run background migrations without having to code the "whole thing" every time, minimizing the amount of code in `up` methods.
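One hypothetical shape for such an abstraction layer – a single generic job parameterized by table name, key column, and a row-processing block, so that each new migration becomes configuration rather than a new class. All names and the interface below are assumptions for illustration, not GitLab's actual API:

```ruby
class GenericChainedMigration
  def initialize(table:, column:, batch_size:, &block)
    @table = table         # e.g. :events – any table name
    @column = column       # any NOT NULL, uniquely indexed column, not only :id
    @batch_size = batch_size
    @block = block
  end

  # rows: stand-in for the table contents; cursor: where the chain resumes.
  # Returns the number of rows processed; 0 means the chain is finished.
  def perform(rows, cursor)
    batch = rows.select { |r| r[@column] >= cursor }
                .sort_by { |r| r[@column] }
                .first(@batch_size)
    return 0 if batch.empty?

    batch.each { |row| @block.call(row) }
    batch.size # the caller would reschedule with batch.last[@column] + 1
  end
end

# Three tables would need only three such configurations, not three classes:
processed = []
m = GenericChainedMigration.new(table: :events, column: :event_id, batch_size: 2) do |row|
  processed << row[:event_id]
end
rows = [{ event_id: 5 }, { event_id: 1 }, { event_id: 9 }]
m.perform(rows, 0) # processes event_id 1 and 5
m.perform(rows, 6) # processes event_id 9
```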
If this is achieved, the methods `bulk_queue_background_migration_jobs_by_range` and `queue_background_migration_jobs_by_range_at_intervals` have to be deprecated as non-scalable, since they lead to poor DB performance on big tables.
Additionally: reporting for performance control (DBA)
During such background processing (called an "infinitely-rescheduling" migration in a Slack discussion), it is worth monitoring some database-related numbers:
- the amount of WAL being generated (using `pg_current_xlog_location()`);
- checkpoint frequency and the write rates of checkpoints / backends / bgwriter (`pg_stat_bgwriter`);
- sizes and size growth rates of the corresponding tables and – most importantly – indexes, to control bloat.
All these metrics will be extremely helpful for pre-deploy analysis of massive background migrations, helping to choose a proper speed and batch size.
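As a small helper for the WAL point above: two readings of `pg_current_xlog_location()` (renamed `pg_current_wal_lsn()` in PostgreSQL 10) can be diffed to get the number of WAL bytes generated between them. The LSN text format is two hex numbers separated by a slash, the high and low 32-bit words of the position; the function names here are hypothetical, only the LSN parsing follows PostgreSQL's documented format:

```ruby
# Convert an LSN like "16/B374D848" to an absolute byte position.
def lsn_to_bytes(lsn)
  high, low = lsn.split('/').map { |part| part.to_i(16) }
  (high << 32) + low
end

# WAL bytes generated between two readings of pg_current_xlog_location().
def wal_bytes_between(lsn_before, lsn_after)
  lsn_to_bytes(lsn_after) - lsn_to_bytes(lsn_before)
end

# e.g. readings taken before and after one batch:
puts wal_bytes_between('16/B374D848', '16/B3751000') # => 14264
```

Sampling this once per batch (or once per minute) gives a WAL generation rate that can be compared across candidate batch sizes during pre-deploy testing.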