Lock retries in database migrations without subtransactions
We apply a so called "lock-retries" methodology in database migrations:
- DDL usually requires (explicit) locks
- Acquiring explicit locks stalls incoming traffic in the queue, from the time the lock is requested (not granted) - this can stall the site
The "lock-retries" methodology works like so:
- Open (sub-)transaction
- Set a
lock_timeout
to a low value (e.g. 100ms) - Try the migration
- If we don't get the lock within the timeout: Wait and retry with a higher
lock_timeout
This allows us to control for how long we stall the site while we try to acquire needed locks. Eventually we either give up (the migration fails) or we force-acquire the lock (we stall the site until we have gotten the lock).
Sub-transactions are only needed here if we wrap parts of the migration into lock-retries.
Proposal
Allow transactional migrations to opt-into lock-retries and apply it at the full transaction level (the whole migration).
class SomeMigration < ActiveRecord::Migration[6.0]
include Gitlab::Database::MigrationHelpers
with_lock_retries!
def up
add_foreign_key(:requirements, :projects, column: :project_id, on_delete: :cascade)
end
# ...
end
In this case, we apply the same methodology but at the outer transaction level. If we don't acquire the lock, we retry the whole migration/transaction instead of only parts of it.
- This doesn't make a difference for simple transactional migrations that only have one
with_lock_retries
block anyways - With multiple of those blocks, we can either decide to merge into one (a retry would be more expensive / more locks taken etc.) or split into multiple transactional migrations (preferrable)
- Non-transactional migrations can still make use of lock-retries, but also only with a full transaction (no subtransactions)
Edited by Andreas Brandl