lock retries possibly not working for transactional migrations
As discussed at https://gitlab.com/groups/gitlab-org/-/epics/17117#note_2726005560, we found that a migration raised an exception as soon as it was run:
logs
StandardError: An error has occurred, this and all later migrations canceled:
PG::InFailedSqlTransaction: ERROR: current transaction is aborted, commands ignored until end of transaction block
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:109:in `check_current_locks'
...
Caused by:
ActiveRecord::StatementInvalid: PG::InFailedSqlTransaction: ERROR: current transaction is aborted, commands ignored until end of transaction block
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:109:in `check_current_locks'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:56:in `block in exec_migration'
...
Caused by:
PG::InFailedSqlTransaction: ERROR: current transaction is aborted, commands ignored until end of transaction block
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:109:in `check_current_locks'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:56:in `block in exec_migration'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/loose_foreign_key_helpers.rb:17:in `track_record_deletions'
...
Caused by:
ActiveRecord::LockWaitTimeout: PG::LockNotAvailable: ERROR: canceling statement due to lock timeout
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/loose_foreign_key_helpers.rb:17:in `track_record_deletions'
/opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20250808083909_track_merge_request_diff_deletions.rb:9:in `up'
...
Caused by:
PG::LockNotAvailable: ERROR: canceling statement due to lock timeout
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/loose_foreign_key_helpers.rb:17:in `track_record_deletions'
/opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20250808083909_track_merge_request_diff_deletions.rb:9:in `up'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/restrict_gitlab_schema.rb:33:in `block in exec_migration'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/query_analyzer.rb:83:in `within'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/restrict_gitlab_schema.rb:30:in `exec_migration'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/automatic_lock_writes_on_tables.rb:21:in `exec_migration'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/require_disable_ddl_transaction_for_multiple_locks.rb:71:in `exec_migration'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:123:in `run_block'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:134:in `block in run_block_with_lock_timeout'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:145:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:232:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:135:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:78:in `transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:129:in `run_block_with_lock_timeout'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:97:in `run'
See the full logs at https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/jobs/19958052#L228. This error was raised from the migration in db/post_migrate/20250808083909_track_merge_request_diff_deletions.rb.
Since this is a transactional migration, it should have automatically retried these timeouts per !135808 (merged).
Looking at the stack trace, it seems like we should be rescuing the exception at https://gitlab.com/gitlab-org/gitlab/-/blob/b9eb5b6a4d001dfb2cecfa24917018812dedf8b4/lib/gitlab/database/with_lock_retries.rb#L98, but the code is actually raising a PG::LockNotAvailable instead of an ActiveRecord::LockWaitTimeout. And PG::LockNotAvailable does not seem to have ActiveRecord::LockWaitTimeout among its ancestors.
So maybe we're not catching the right exceptions ever since some Rails upgrade?
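One more thing that may be worth checking: in the "Caused by:" chain above, the ActiveRecord::LockWaitTimeout is listed underneath an ActiveRecord::StatementInvalid, so the timeout may not be the outermost exception by the time it reaches the rescue. The sketch below is only an illustration of that Ruby behavior. It uses stand-in classes (hypothetical, not the real pg or Rails classes, and not the WithLockRetries code) and assumes the stand-in LockWaitTimeout subclasses StatementInvalid. It shows that rescue matches only the outermost exception, never anything in its cause chain.

```ruby
# Stand-in classes (hypothetical); they only mimic the names seen in the trace.
module StandInPG
  class LockNotAvailable < StandardError; end
end

module StandInAR
  class StatementInvalid < StandardError; end
  class LockWaitTimeout < StatementInvalid; end
end

# Collects the class of an exception and of each of its causes, outermost first.
def cause_chain(error)
  chain = []
  while error
    chain << error.class
    error = error.cause
  end
  chain
end

# Raises a lock timeout, then fails again while handling it, so the
# timeout ends up as the cause of a different exception.
def migrate_with_followup_failure
  begin
    begin
      raise StandInPG::LockNotAvailable, "canceling statement due to lock timeout"
    rescue StandInPG::LockNotAvailable
      # Ruby sets `cause` automatically when raising inside a rescue.
      raise StandInAR::LockWaitTimeout, "lock timeout"
    end
  rescue StandInAR::LockWaitTimeout
    raise StandInAR::StatementInvalid, "current transaction is aborted"
  end
end

# Reports which rescue clause fires and what the cause chain looks like.
def classify_failure
  migrate_with_followup_failure
  [:no_error, []]
rescue StandInAR::LockWaitTimeout => e
  [:retryable, cause_chain(e)]
rescue StandInAR::StatementInvalid => e
  [:not_retryable, cause_chain(e)]
end

outcome, chain = classify_failure
puts outcome
puts chain.map(&:name).join(" <- ")
```

If the real code behaves like this sketch, a rescue for ActiveRecord::LockWaitTimeout would never fire here even though the timeout is in the chain, which would be a different failure mode from the ancestors question above. Whether that is what actually happens in this migration still needs to be confirmed against the real code.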