Automatically reindex tags table if collation/unique tags name not present

We've had numerous customers upgrade the operating system underneath PostgreSQL and end up with index corruption, as described in https://docs.gitlab.com/ee/administration/postgresql/upgrading_os.html.

One of the most significant problems caused by this index corruption is that runners often don't pick up jobs that have tags: the UNIQUE INDEX on tags.name no longer functions properly, so duplicate tag names can end up in the table.

To fix this issue, customers have done the following:

  1. Run this Ruby script: $3700665. It needs to be run in the gitlab-rails console, for example via load '/tmp/dedupe.rb' (or something of the sort).
  2. Once this is done, drop the remaining duplicates in psql and reindex:
DELETE FROM tags
WHERE id NOT IN (
    SELECT MIN(id)
    FROM tags
    GROUP BY name
);
REINDEX INDEX CONCURRENTLY index_tags_on_name;
REINDEX INDEX CONCURRENTLY index_tags_on_name_trigram;

We should automate detection of corruption in the tags.name index and provide an easy way to repair it via a Rake task. We may also want to repair it automatically upon an upgrade.
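
As a starting point for the detection part, here is a minimal sketch of the check such a Rake task could run. It assumes the task has already read (id, name) pairs from the tags table; the helper names duplicate_tag_groups and tags_name_unique_violated? are hypothetical and do not exist in GitLab today. It mirrors the DELETE above: for every duplicated name, the lowest id is kept and the rest are candidates for removal.

```ruby
# Hypothetical helpers, not part of GitLab. `rows` is an array of [id, name]
# pairs. In a real task they might come from something like
# ActiveRecord::Base.connection.select_rows("SELECT id, name FROM tags"),
# which is an assumption about how the task would be wired up.

# Returns a hash of name => { keep: lowest_id, drop: [other_ids] } for every
# name that appears more than once, matching the MIN(id) rule in the SQL above.
def duplicate_tag_groups(rows)
  rows
    .group_by { |_id, name| name }
    .select { |_name, group| group.size > 1 }
    .transform_values do |group|
      ids = group.map(&:first).sort
      { keep: ids.first, drop: ids.drop(1) }
    end
end

# True when at least one name is duplicated, which a working UNIQUE INDEX on
# tags.name would have prevented.
def tags_name_unique_violated?(rows)
  !duplicate_tag_groups(rows).empty?
end

# Example with made-up data: "docker" appears three times.
sample_rows = [[1, "docker"], [2, "linux"], [3, "docker"], [4, "docker"]]
duplicate_tag_groups(sample_rows).each do |name, plan|
  puts "#{name}: keep id #{plan[:keep]}, drop ids #{plan[:drop].join(', ')}"
end
```

Note that this only catches corruption that has already let duplicates in. An empty result does not prove the index is healthy, so a reindex may still be warranted after an OS upgrade.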

FYI @pedropombeiro