Async drop redundant index index_ci_job_variables_on_partition_id_job_id

What does this MR do and why?

This MR prepares the async removal of the index index_ci_job_variables_on_partition_id_job_id on table ci_job_variables. By dropping the index we can reclaim at least 12 GB of disk space.

It will be removed synchronously as part of the same issue #544929 (closed).

Index removal justification

We can remove:

"index_ci_job_variables_on_partition_id_job_id" btree (partition_id, job_id)

because:

  1. This index was originally introduced to support our CI partitioning efforts related to FKs. It is no longer required for this purpose. Refs #498968 (comment 2454944661), #498968 (comment 2497789884):

We added these at the beginning of the partitioning work because we had a database test that made it mandatory for each FK to have an accompanying index.

The ones from the table are droppable because we added them for FKs, but they are not needed, the FKs can delete fine without them.

  1. We already have coverage on job_id with "index_ci_job_variables_on_job_id" btree (job_id), which makes the index with partition_id redundant because:
  • Its relationship with p_ci_builds is non-unique, so we don't need a composite with partition_id to ensure uniqueness.
  • The table ci_job_variables is non-partitioned, so filtering by job_id is sufficient. Including partition_id doesn't significantly improve query performance.
  1. We do see some usage on the index in Grafana, but it's relatively low compared to the index on just job_id.
  • This low usage can be accounted for by Postgres' occasional preference for index_ci_job_variables_on_partition_id_job_id for certain queries, but those same queries should be almost as effectively supported by index_ci_job_variables_on_job_id as explained above.
  1. We don't have any queries on ci_job_variables that filter only on partition_id and not job_id.

References

How to set up and validate locally

Follow the steps in Verify indexes removed asynchronously. You may have to repeatedly run bundle exec rails gitlab:db:reindex since it only processes 2 pending async indexes at a time by default.

The following shows my local results.

Before dropping index:

gitlabhq_development=# \d+ index_ci_job_variables_on_partition_id_job_id
     Index "public.index_ci_job_variables_on_partition_id_job_id"
    Column    |  Type  | Key? |  Definition  | Storage | Stats target 
--------------+--------+------+--------------+---------+--------------
 partition_id | bigint | yes  | partition_id | plain   | 
 job_id       | bigint | yes  | job_id       | plain   | 
btree, for table "public.ci_job_variables"

After dropping index:

gitlabhq_development=# \d+ index_ci_job_variables_on_partition_id_job_id
Did not find any relation named "index_ci_job_variables_on_partition_id_job_id".

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #544929 (closed)

Edited by Leaminn Ma

Merge request reports

Loading