lock retries possibly not working for transactional migrations


As discussed at https://gitlab.com/groups/gitlab-org/-/epics/17117#note_2726005560, a migration was run and immediately raised an exception:

logs
StandardError: An error has occurred, this and all later migrations canceled:
  
    PG::InFailedSqlTransaction: ERROR:  current transaction is aborted, commands ignored until end of transaction block
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:109:in `check_current_locks'
...
  
    Caused by:
    ActiveRecord::StatementInvalid: PG::InFailedSqlTransaction: ERROR:  current transaction is aborted, commands ignored until end of transaction block
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:109:in `check_current_locks'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:56:in `block in exec_migration'
...
  
    Caused by:
    PG::InFailedSqlTransaction: ERROR:  current transaction is aborted, commands ignored until end of transaction block
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:109:in `check_current_locks'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:56:in `block in exec_migration'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/loose_foreign_key_helpers.rb:17:in `track_record_deletions'
...
  
    Caused by:
    ActiveRecord::LockWaitTimeout: PG::LockNotAvailable: ERROR:  canceling statement due to lock timeout
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/loose_foreign_key_helpers.rb:17:in `track_record_deletions'
    /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20250808083909_track_merge_request_diff_deletions.rb:9:in `up'
...
  
    Caused by:
    PG::LockNotAvailable: ERROR:  canceling statement due to lock timeout
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/loose_foreign_key_helpers.rb:17:in `track_record_deletions'
    /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20250808083909_track_merge_request_diff_deletions.rb:9:in `up'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/restrict_gitlab_schema.rb:33:in `block in exec_migration'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/query_analyzer.rb:83:in `within'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/restrict_gitlab_schema.rb:30:in `exec_migration'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/automatic_lock_writes_on_tables.rb:21:in `exec_migration'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:71:in `exec_migration'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:123:in `run_block'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:134:in `block in run_block_with_lock_timeout'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:145:in `block in read_write'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:232:in `retry_with_backoff'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:135:in `read_write'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:78:in `transaction'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:129:in `run_block_with_lock_timeout'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:97:in `run'

See the full logs at https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/jobs/19958052#L228. This error was raised from the migration in db/post_migrate/20250808083909_track_merge_request_diff_deletions.rb.

Since this is a transactional migration, it should have automatically retried these timeouts per !135808 (merged).

Looking at the stack trace, it seems we should be rescuing the exception at https://gitlab.com/gitlab-org/gitlab/-/blob/b9eb5b6a4d001dfb2cecfa24917018812dedf8b4/lib/gitlab/database/with_lock_retries.rb#L98, but the code is actually raising a PG::LockNotAvailable instead of an ActiveRecord::LockWaitTimeout, and PG::LockNotAvailable does not seem to have ActiveRecord::LockWaitTimeout in its ancestors.

So maybe we're not catching the right exceptions ever since some Rails upgrade?
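
To make the suspicion above concrete, here is a minimal sketch of how Ruby's `rescue` matching interacts with wrapped exceptions. It does not use the real `pg` or ActiveRecord classes: `FakePG`, `FakeAR`, `run_with_retry_rescue`, `caused_by?` and the two `raise_*` helpers are hypothetical stand-ins with a simplified hierarchy, and this is not the actual `WithLockRetries` code. The sketch only shows that `rescue` matches the class of the outermost exception, so a lock timeout that sits further down the `cause` chain (as in the trace above, where the outermost error is `PG::InFailedSqlTransaction`) would not be matched by a `rescue` for the lock timeout class.

```ruby
# Stand-in classes only. These are NOT the real pg / ActiveRecord hierarchies.
module FakePG
  class Error < StandardError; end
  class LockNotAvailable < Error; end
end

module FakeAR
  class StatementInvalid < StandardError; end
  class LockWaitTimeout < StatementInvalid; end
end

# Hypothetical retry wrapper: returns :retried when the rescue clause matches,
# otherwise lets the exception propagate to the caller.
def run_with_retry_rescue
  yield
  :ok
rescue FakeAR::LockWaitTimeout
  :retried
end

# Walks the #cause chain and reports whether any link is an instance of klass.
def caused_by?(error, klass)
  while error
    return true if error.is_a?(klass)
    error = error.cause
  end
  false
end

# Driver-level error wrapped by an adapter-level error (cause is set automatically).
def raise_wrapped_timeout
  raise FakePG::LockNotAvailable, "canceling statement due to lock timeout"
rescue FakePG::LockNotAvailable
  raise FakeAR::LockWaitTimeout, "lock timeout"
end

# A later statement fails inside the aborted transaction, so the outermost
# exception is no longer a LockWaitTimeout; the timeout is only in the cause chain.
def raise_failed_transaction_after_timeout
  raise_wrapped_timeout
rescue FakeAR::LockWaitTimeout
  raise FakeAR::StatementInvalid, "current transaction is aborted"
end

# Case 1: the outermost exception is the wrapper class, so the rescue matches.
p(run_with_retry_rescue { raise_wrapped_timeout })

# Case 2: the outermost exception is a different class, so it propagates,
# even though the lock timeout is still reachable through #cause.
begin
  run_with_retry_rescue { raise_failed_transaction_after_timeout }
rescue FakeAR::StatementInvalid => e
  p e.class
  p caused_by?(e, FakeAR::LockWaitTimeout)
end
```

If this is what is happening, one option might be to inspect the `cause` chain before deciding whether to retry, but that is only a guess until the real class hierarchy and the rescue at the linked line are checked.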
