Skip to content

Detecting duplicate btree indexes

Omar Qunsul requested to merge poc-detect-duplicated-indexes into master

What does this MR do and why?

Detect duplicate indexes.

Related Issue: #385701 (closed)

Based on the discussion on the related issue:

  1. We will create issues for each team that is owning a table so that they remove the duplicate indexes.
  2. Once this MR is merged, any duplicate index will fail the tests, so that it requires teams to remove the old index in the same MR.

Steps to validate locally

  1. Create a duplicate index on abuse_report_events, in the TEST environment:
gdk psql -d "gitlabhq_test" -c "CREATE INDEX index_abuse_report_events_on_user_id_and_abuse_report_id ON abuse_report_events(user_id, abuse_report_id);"
  1. Run this test, which should fail by complaining about a duplicate index
./bin/rspec spec/db/schema_spec.rb --example 'for table abuse_report_events btree indexes does not have any duplicated indexes'

The expected error:

       Duplicate index: index_abuse_report_events_on_user_id_and_abuse_report_id with ["index_abuse_report_events_on_user_id"]
       index_abuse_report_events_on_user_id_and_abuse_report_id : [{:name=>"user_id", :order=>:asc}, {:name=>"abuse_report_id", :order=>:asc}]
       index_abuse_report_events_on_user_id : [{:name=>"user_id", :order=>:asc}].
       Consider dropping the indexes index_abuse_report_events_on_user_id
  1. Clean up. Drop the index and run the test again (step 2). It should succeed
gdk psql -d "gitlabhq_test" -c "DROP INDEX index_abuse_report_events_on_user_id_and_abuse_report_id;"

Potential Indexes Storage Savings

  • On Main around 51 GBs. here (internal thanos link)
  • On CI around 189 GBs. here (internal thanos link)

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Omar Qunsul

Merge request reports