Add repair index tool

What does this MR do and why?

This MR adds a new file called lib/gitlab/database/repair_index.rb. The idea is that based on a defined hash structure, it will fix the corrupted indexes by recreating it. It will also remove the duplicates by keeping the oldest ID, while also updating any references records before deleting.

It is based on the solution script commented in this thread: #372150 (comment 1083479615)

Currently, we have a simple hash constant as default: INDEXES_TO_REPAIR with known indexes.

The idea is to be able to plug new corrupted indexes in this HASH, and the repair should happen automatically. If we find any new complexity, we need to update the repair logic based on that.

The ideal goal will be that the corrupted indexes list will be generated via schema_checker or collation_checker with this defined JSON schema structure and fed into this tool to recreate the indexes automatically.

References

  1. Add a command for reindexing a system broken by... (omnibus-gitlab#6974)
  2. Database index corruption caused by a change in... (&8573)

Database

This tool is aimed at addressing the index corruption caused by collation mismatch after OS upgrade. It tries to fix unique indexes which may get corrupted if collation change and Müller and Müller gets treated as same, causing unique index corruption. As the existing index is already a unique index, I think there won't be many duplicates. The number of duplicates records so far based on customer issues are less than 100 (43 rows, 3 rows, 3 rows). We can take a guesstimate that the number would be in thousands at max. Thus, loaded all the possible duplicates in a hash/set should be better approach here to start with (Also see existing deduplicate_ci_tags implementation)

  1. Finding duplicates
    1. tags: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/41308/commands/126904
    2. merge_request_diff_commit_users: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/41304/commands/126937
  2. Finding reference table records for entity_id that has both bad tag_id along with good tag_id
    1. https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/41308/commands/126918
  3. Deleting reference table records that has bad tag_id record along with entity_id
    1. https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/41308/commands/126924
  4. Updating any bad id reference records with good tag id
    1. https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/41308/commands/126931
  5. Finding array columns on reference table with bad ids as element
    1. https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/41308/commands/126982
  6. Replace bad tag id with good tag in in array column
    1. https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/41308/commands/126983
  7. Delete the duplicates
    1. https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/41308/commands/126934

How to set up and validate locally

  1. Real execution

    $ bundle exec rake gitlab:db:repair_index:main
    I, [2025-07-10T11:47:12.638574 #59602]  INFO -- : Running Index repair on database main...
    I, [2025-07-10T11:47:12.642720 #59602]  INFO -- : Processing index 'index_merge_request_diff_commit_users_on_name_and_email' on table 'merge_request_diff_commit_users'...
    I, [2025-07-10T11:47:12.642740 #59602]  INFO -- : Index is unique. Checking for duplicate data...
    I, [2025-07-10T11:47:12.642747 #59602]  INFO -- : Checking for duplicates in 'merge_request_diff_commit_users' for columns: name,email...
    I, [2025-07-10T11:47:12.642799 #59602]  INFO -- : SQL: SELECT ARRAY_AGG(id ORDER BY id ASC) as ids
    FROM "merge_request_diff_commit_users"
    GROUP BY "name", "email"
    HAVING COUNT(*) > 1
    
    I, [2025-07-10T11:47:12.657864 #59602]  INFO -- : No duplicates found in 'merge_request_diff_commit_users' for columns: name,email.
    I, [2025-07-10T11:47:12.657915 #59602]  INFO -- : SQL: SELECT 1 FROM pg_indexes
    WHERE indexname = 'index_merge_request_diff_commit_users_on_name_and_email'
    AND tablename = 'merge_request_diff_commit_users'
    
    I, [2025-07-10T11:47:12.665283 #59602]  INFO -- : Index exists. Reindexing...
    I, [2025-07-10T11:47:12.665298 #59602]  INFO -- : Reindexing index 'index_merge_request_diff_commit_users_on_name_and_email'...
    I, [2025-07-10T11:47:12.665331 #59602]  INFO -- : SQL: REINDEX INDEX CONCURRENTLY "index_merge_request_diff_commit_users_on_name_and_email"
    I, [2025-07-10T11:47:12.671953 #59602]  INFO -- : Index reindexed successfully.
    I, [2025-07-10T11:47:12.671964 #59602]  INFO -- : Processing index 'index_merge_request_diff_commit_users_on_org_id_name_email' on table 'merge_request_diff_commit_users'...
    I, [2025-07-10T11:47:12.671969 #59602]  INFO -- : Index is unique. Checking for duplicate data...
    I, [2025-07-10T11:47:12.671975 #59602]  INFO -- : Checking for duplicates in 'merge_request_diff_commit_users' for columns: organization_id,name,email...
    I, [2025-07-10T11:47:12.672015 #59602]  INFO -- : SQL: SELECT ARRAY_AGG(id ORDER BY id ASC) as ids
    FROM "merge_request_diff_commit_users"
    GROUP BY "organization_id", "name", "email"
    HAVING COUNT(*) > 1
    
    I, [2025-07-10T11:47:12.677315 #59602]  INFO -- : No duplicates found in 'merge_request_diff_commit_users' for columns: organization_id,name,email.
    I, [2025-07-10T11:47:12.677359 #59602]  INFO -- : SQL: SELECT 1 FROM pg_indexes
    WHERE indexname = 'index_merge_request_diff_commit_users_on_org_id_name_email'
    AND tablename = 'merge_request_diff_commit_users'
    
    I, [2025-07-10T11:47:12.682984 #59602]  INFO -- : Index exists. Reindexing...
    I, [2025-07-10T11:47:12.683002 #59602]  INFO -- : Reindexing index 'index_merge_request_diff_commit_users_on_org_id_name_email'...
    I, [2025-07-10T11:47:12.683036 #59602]  INFO -- : SQL: REINDEX INDEX CONCURRENTLY "index_merge_request_diff_commit_users_on_org_id_name_email"
    I, [2025-07-10T11:47:12.690942 #59602]  INFO -- : Index reindexed successfully.
    I, [2025-07-10T11:47:12.690956 #59602]  INFO -- : Index repair completed for database main.
    
  2. Dry Run, with DRY_RUN=true

    $ bundle exec rake gitlab:db:repair_index:main DRY_RUN=true
    I, [2025-07-10T11:55:23.710765 #60564]  INFO -- : DRY RUN: Analysis only, no changes will be made.
    I, [2025-07-10T11:55:23.710784 #60564]  INFO -- : Running Index repair on database main...
    I, [2025-07-10T11:55:23.714121 #60564]  INFO -- : Processing index 'index_merge_request_diff_commit_users_on_name_and_email' on table 'merge_request_diff_commit_users'...
    I, [2025-07-10T11:55:23.714139 #60564]  INFO -- : Index is unique. Checking for duplicate data...
    I, [2025-07-10T11:55:23.714148 #60564]  INFO -- : Checking for duplicates in 'merge_request_diff_commit_users' for columns: name,email...
    I, [2025-07-10T11:55:23.714241 #60564]  INFO -- : SQL: SELECT ARRAY_AGG(id ORDER BY id ASC) as ids
    FROM "merge_request_diff_commit_users"
    GROUP BY "name", "email"
    HAVING COUNT(*) > 1
    
    I, [2025-07-10T11:55:23.722854 #60564]  INFO -- : No duplicates found in 'merge_request_diff_commit_users' for columns: name,email.
    I, [2025-07-10T11:55:23.722888 #60564]  INFO -- : SQL: SELECT 1 FROM pg_indexes
    WHERE indexname = 'index_merge_request_diff_commit_users_on_name_and_email'
    AND tablename = 'merge_request_diff_commit_users'
    
    I, [2025-07-10T11:55:23.727670 #60564]  INFO -- : Index exists. Reindexing...
    I, [2025-07-10T11:55:23.727679 #60564]  INFO -- : Reindexing index 'index_merge_request_diff_commit_users_on_name_and_email'...
    I, [2025-07-10T11:55:23.727700 #60564]  INFO -- : SQL: REINDEX INDEX CONCURRENTLY "index_merge_request_diff_commit_users_on_name_and_email"
    I, [2025-07-10T11:55:23.727704 #60564]  INFO -- : Index reindexed successfully.
    I, [2025-07-10T11:55:23.727709 #60564]  INFO -- : Processing index 'index_merge_request_diff_commit_users_on_org_id_name_email' on table 'merge_request_diff_commit_users'...
    I, [2025-07-10T11:55:23.727713 #60564]  INFO -- : Index is unique. Checking for duplicate data...
    I, [2025-07-10T11:55:23.727717 #60564]  INFO -- : Checking for duplicates in 'merge_request_diff_commit_users' for columns: organization_id,name,email...
    I, [2025-07-10T11:55:23.727755 #60564]  INFO -- : SQL: SELECT ARRAY_AGG(id ORDER BY id ASC) as ids
    FROM "merge_request_diff_commit_users"
    GROUP BY "organization_id", "name", "email"
    HAVING COUNT(*) > 1
    
    I, [2025-07-10T11:55:23.737390 #60564]  INFO -- : No duplicates found in 'merge_request_diff_commit_users' for columns: organization_id,name,email.
    I, [2025-07-10T11:55:23.737449 #60564]  INFO -- : SQL: SELECT 1 FROM pg_indexes
    WHERE indexname = 'index_merge_request_diff_commit_users_on_org_id_name_email'
    AND tablename = 'merge_request_diff_commit_users'
    
    I, [2025-07-10T11:55:23.745271 #60564]  INFO -- : Index exists. Reindexing...
    I, [2025-07-10T11:55:23.745280 #60564]  INFO -- : Reindexing index 'index_merge_request_diff_commit_users_on_org_id_name_email'...
    I, [2025-07-10T11:55:23.745303 #60564]  INFO -- : SQL: REINDEX INDEX CONCURRENTLY "index_merge_request_diff_commit_users_on_org_id_name_email"
    I, [2025-07-10T11:55:23.745307 #60564]  INFO -- : Index reindexed successfully.
    I, [2025-07-10T11:55:23.745311 #60564]  INFO -- : Index repair completed for database main.

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Bishwa Hang Rai

Merge request reports

Loading