Use Backup::DatabaseConnection for index_repair

What does this MR do and why?

This change improves GitLab's database index repair functionality by adding support for bypassing PgBouncer (a database connection pooler) when performing long-running database maintenance tasks. It follows the same approach already used for the collation_checker task in !202736 (merged).

The main improvements include:

  1. Documentation updates: Added instructions showing administrators how to use direct database connections instead of going through PgBouncer, which can time out during lengthy index repair operations. This includes examples of environment variables that can be set to specify database connection details.
  2. Code refactoring: Modified the underlying index repair tasks to use Backup::DatabaseConnection, which respects these environment variables, allowing the tasks to connect directly to PostgreSQL when needed.
  3. Enhanced testing: Updated the test suite to properly verify that the new connection handling works correctly across different database configurations (single database, multiple databases, etc.) and respects the dry-run mode.
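
As a rough illustration of the idea behind item 2, the sketch below layers environment-variable overrides on top of a base connection config. It is not the actual Backup::DatabaseConnection code: the helper name `connection_config_with_overrides`, the `OVERRIDE_KEYS` table, and the host names are hypothetical, and only `GITLAB_BACKUP_PGUSER` appears in this description (the host and port variable names are assumed by analogy).

```ruby
# Hypothetical mapping from override variables to connection keys.
# Only GITLAB_BACKUP_PGUSER is taken from this MR description; the
# host and port variable names are assumptions made by analogy.
OVERRIDE_KEYS = {
  'GITLAB_BACKUP_PGHOST' => :host,
  'GITLAB_BACKUP_PGPORT' => :port,
  'GITLAB_BACKUP_PGUSER' => :username
}.freeze

# Returns a copy of base_config with every set, non-empty override applied.
# The env parameter defaults to the process environment but accepts any
# hash-like object, which keeps the helper easy to exercise in isolation.
def connection_config_with_overrides(base_config, env = ENV)
  OVERRIDE_KEYS.each_with_object(base_config.dup) do |(var, key), config|
    value = env[var]
    next if value.nil? || value.empty?

    config[key] = key == :port ? Integer(value) : value
  end
end

# Illustrative values only: the base config points at a pooler,
# and the overrides redirect the task to PostgreSQL directly.
base = { host: 'pgbouncer.internal', port: 6432, username: 'gitlab' }
overrides = { 'GITLAB_BACKUP_PGHOST' => 'postgres.internal', 'GITLAB_BACKUP_PGPORT' => '5432' }
direct = connection_config_with_overrides(base, overrides)
p direct
```

Variables that are unset or empty leave the base config untouched, so running the task without any overrides behaves as before.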

The change essentially gives administrators more control over how database maintenance tasks connect to the database, helping avoid timeout issues that can occur when using PgBouncer during intensive operations. This is particularly useful for large GitLab installations where index repairs might take a long time to complete.

How to set up and validate locally

  1. Check out this MR

  2. Run the task: bin/rails gitlab:db:repair_index:main. It should run without error.

  3. Run the task again with the correct ENV variable set:

    $ GITLAB_BACKUP_PGUSER=bishwa bin/rails gitlab:db:repair_index:main

  4. To test that it respects the ENV variable, set a wrong value and confirm that the task fails:

    $ GITLAB_BACKUP_PGUSER=foobar bin/rails gitlab:db:repair_index:main
    I, [2025-09-05T10:49:53.204922 #6203]  INFO -- : Running Index repair on database main...
    bin/rails aborted!
    ActiveRecord::DatabaseConnectionError: There is an issue connecting to your database with your username/password, username: foobar.
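
The manual check in step 4 could also be scripted. The following is a minimal sketch, assuming it is run from the root of a GitLab checkout; the helper `run_with_env` is a hypothetical name, and the guard on `bin/rails` only keeps the snippet harmless when run elsewhere.

```ruby
require 'open3'
require 'rbconfig'

# Runs a command with extra environment variables and returns
# [combined stdout and stderr, whether the command exited successfully].
def run_with_env(env, *command)
  output, status = Open3.capture2e(env, *command)
  [output, status.success?]
end

# Only attempt the real task when a Rails binstub is present.
if File.exist?('bin/rails')
  output, ok = run_with_env(
    { 'GITLAB_BACKUP_PGUSER' => 'foobar' },
    'bin/rails', 'gitlab:db:repair_index:main'
  )
  # With a wrong user the task is expected to abort.
  raise 'task unexpectedly succeeded with a wrong user' if ok

  puts output
end
```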

Regression Test

  • Configure gdk for single database mode

    gdk config set gitlab.rails.databases.ci.enabled false
    gdk reconfigure
  • Now run the specs and the tasks

    bundle exec rspec spec/tasks/gitlab/db_rake_spec.rb
    
    bin/rails gitlab:db:repair_index
  • Revert the gdk config to original

    gdk config set gitlab.rails.databases.ci.enabled true
    gdk reconfigure
  • Test the ci-connection-only setup, see !202736 (comment 2726034152)

    • Open config/database.yml in an editor, for example: vim config/database.yml

    • Change the test section values to mimic the ci-connection test, where ci points to the same database as main but with rake tasks disabled

      test: &test
        main:
          # ... rest of test config stays the same
          database: gitlabhq_test        # Same database name
        ci:
          # ... rest of test config stays the same
          database: gitlabhq_test        # Same database as main (this is the key change)
          database_tasks: false          # Keep this as false
      
      # ... rest of test config stays the same
    • Run the spec: bundle exec rspec spec/tasks/gitlab/db_rake_spec.rb

  • Run gdk reconfigure to reset the manually edited config/database.yml
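
To make the ci-connection-only setup concrete, the sketch below parses a cut-down stand-in for the edited test section and lists which databases still have rake tasks enabled. The helper `task_enabled_databases` is hypothetical and is not part of GitLab; it only mirrors the convention that `database_tasks` counts as enabled unless it is explicitly set to false.

```ruby
require 'yaml'

# Cut-down stand-in for the edited test section of config/database.yml.
DATABASE_YML = <<~YAML
  test:
    main:
      database: gitlabhq_test
    ci:
      database: gitlabhq_test
      database_tasks: false
YAML

# Returns the names of databases whose rake tasks are enabled.
# A missing database_tasks key is treated as enabled.
def task_enabled_databases(yaml, env_name)
  YAML.safe_load(yaml)
      .fetch(env_name)
      .reject { |_name, config| config['database_tasks'] == false }
      .keys
end

p task_enabled_databases(DATABASE_YML, 'test') # => ["main"]
```

With this configuration, only main is a candidate for the repair task, even though ci is a valid connection to the same database.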

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Bishwa Hang Rai
