Use Backup::DatabaseConnection for index_repair
What does this MR do and why?
This change improves GitLab's database index repair functionality by adding support for bypassing PgBouncer (a database connection pooler) when performing long-running database maintenance tasks. It follows the same approach as the earlier MR for the collation_checker task: !202736 (merged)
The main improvements include:
- Documentation updates: Added instructions showing administrators how to use direct database connections instead of going through PgBouncer, which can time out during lengthy index repair operations. The instructions include examples of environment variables that can be set to specify database connection details.
- Code refactoring: Modified the underlying repair index tasks to use a backup database connection system that can respect these environment variables, allowing the tasks to connect directly to PostgreSQL when needed.
- Enhanced testing: Updated the test suite to properly verify that the new connection handling works correctly across different database configurations (single database, multiple databases, etc.) and respects the dry-run mode.
The change essentially gives administrators more control over how database maintenance tasks connect to the database, helping avoid timeout issues that can occur when using PgBouncer during intensive operations. This is particularly useful for large GitLab installations where index repairs might take a long time to complete.
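As a rough illustration of the override idea described above, here is a minimal Ruby sketch. It is not the actual Backup::DatabaseConnection implementation: the helper name, the hash layout, and the host names are hypothetical. Only GITLAB_BACKUP_PGUSER appears in this MR; the other variable names are assumed to follow the same pattern.

```ruby
# Hypothetical helper, NOT GitLab's actual Backup::DatabaseConnection.
# Maps connection-config keys to the environment variables that override them.
# Only GITLAB_BACKUP_PGUSER is taken from this MR; the rest are assumptions.
OVERRIDE_ENV_VARS = {
  username: 'GITLAB_BACKUP_PGUSER',
  host: 'GITLAB_BACKUP_PGHOST',
  port: 'GITLAB_BACKUP_PGPORT',
  password: 'GITLAB_BACKUP_PGPASSWORD'
}.freeze

# Returns a copy of base_config with every set, non-empty override applied.
# `env` defaults to the process environment; a plain Hash also works.
def connection_config_with_overrides(base_config, env = ENV)
  OVERRIDE_ENV_VARS.each_with_object(base_config.dup) do |(key, var), config|
    value = env[var]
    next if value.nil? || value.empty?

    config[key] = key == :port ? Integer(value) : value
  end
end

# Example: the regular config points at PgBouncer (hypothetical hosts);
# the overrides redirect the task to PostgreSQL directly.
pooled = { host: 'pgbouncer.internal', port: 6432, username: 'gitlab' }
direct = connection_config_with_overrides(
  pooled,
  { 'GITLAB_BACKUP_PGHOST' => 'postgres.internal', 'GITLAB_BACKUP_PGPORT' => '5432' }
)
```

In this sketch, keys without a matching environment variable keep their configured values, so an administrator only needs to set the variables that differ from the pooled connection.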
References
- Previous MR for another task: !202736 (merged)
- Related to Use Backup::DatabaseConnection also for repair ... (#568004 - closed)
How to set up and validate locally
- Check out this MR.
- Run the task; it should run without error:
    bin/rails gitlab:db:repair_index:main
- Run the task again with a correct ENV variable set:
    GITLAB_BACKUP_PGUSER=bishwa bin/rails gitlab:db:repair_index:main
- To test that it respects the ENV variable, set a wrong value and assert that it fails:
    GITLAB_BACKUP_PGUSER=foobar bin/rails gitlab:db:repair_index:main
    I, [2025-09-05T10:49:53.204922 #6203] INFO -- : Running Index repair on database main...
    bin/rails aborted!
    ActiveRecord::DatabaseConnectionError: There is an issue connecting to your database with your username/password, username: foobar.
Regression Test
- Configure gdk for single database mode:
    gdk config set gitlab.rails.databases.ci.enabled false
    gdk reconfigure
- Now run the specs and the tasks:
    bundle exec rspec spec/tasks/gitlab/db_rake_spec.rb
    bin/rails gitlab:db:repair_index
- Revert the gdk config to the original:
    gdk config set gitlab.rails.databases.ci.enabled true
    gdk reconfigure
- Test ci-connection only, see !202736 (comment 2726034152):
  - Open config/database.yml:
      vim config/database.yml
  - Change the test section values to mimic the ci-connection test, where ci points to the same database as main but with rake tasks disabled:
      test: &test
        main:
          # ... rest of test config stays the same
          database: gitlabhq_test # Same database name
        ci:
          # ... rest of test config stays the same
          database: gitlabhq_test # Same database as main (this is the key change)
          database_tasks: false # Keep this as false
          # ... rest of test config stays the same
  - Run the spec:
      bundle exec rspec spec/tasks/gitlab/db_rake_spec.rb
  - Run gdk reconfigure to reset the manually edited config/database.yml.
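The ci-connection scenario can be pictured with a small Ruby sketch. It assumes that the repair task skips any connection whose database_tasks setting is false, so a database shared by main and ci is processed only once. The method name and hash layout are hypothetical and are not taken from the actual rake task code.

```ruby
# Hypothetical selection logic, NOT the actual gitlab:db:repair_index code.
# Assumption: connections with `database_tasks: false` are skipped, so a
# database shared between `main` and `ci` is repaired once, through `main`.
def databases_to_repair(configs)
  configs.reject { |_name, config| config[:database_tasks] == false }.keys
end

# Mirrors the edited test section described above.
ci_connection_config = {
  main: { database: 'gitlabhq_test' },
  ci: { database: 'gitlabhq_test', database_tasks: false }
}

databases_to_repair(ci_connection_config) # => [:main]
```

Under the same assumption, a regular multi-database setup where neither connection disables database tasks would yield both main and ci.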
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.