Lock retries in database migrations without subtransactions

We apply a so called "lock-retries" methodology in database migrations:

DDL usually requires (explicit) locks
Acquiring explicit locks stalls incoming traffic in the queue, from the time the lock is requested (not granted) - this can stall the site

The "lock-retries" methodology works like so:

Open (sub-)transaction
Set a lock_timeout to a low value (e.g. 100ms)
Try the migration
If we don't get the lock within the timeout: Wait and retry with a higher lock_timeout

This allows us to control for how long we stall the site while we try to acquire needed locks. Eventually we either give up (the migration fails) or we force-acquire the lock (we stall the site until we have gotten the lock).

Sub-transactions are only needed here if we wrap parts of the migration into lock-retries.

Proposal

Allow transactional migrations to opt-into lock-retries and apply it at the full transaction level (the whole migration).

class SomeMigration < ActiveRecord::Migration[6.0]
  include Gitlab::Database::MigrationHelpers

  with_lock_retries!

  def up
    add_foreign_key(:requirements, :projects, column: :project_id, on_delete: :cascade)
  end

  # ...
end

In this case, we apply the same methodology but at the outer transaction level. If we don't acquire the lock, we retry the whole migration/transaction instead of only parts of it.

This doesn't make a difference for simple transactional migrations that only have one with_lock_retries block anyways
With multiple of those blocks, we can either decide to merge into one (a retry would be more expensive / more locks taken etc.) or split into multiple transactional migrations (preferrable)
Non-transactional migrations can still make use of lock-retries, but also only with a full transaction (no subtransactions)

Edited Aug 23, 2021 by Andreas Brandl