Skip to content

Reindexing strategy for PG12

Andreas Brandl requested to merge ab/reindex-concurrently into master

What does this MR do?

"Reindexing" re-creates indexes in background to reduce accumulated bloat (it's a maintenance operation). Our current strategy is based on what <= PG 11 supports: Create temporary new index with the same definition, swap and drop the original index. This doesn't work for unique/primary key indexes and other types. With PG12, we can leverage REINDEX CONCURRENTLY feature - which allows us to reindex all types of indexes.

More background in:

This MR adds a PG12 reindexing strategy using the newly introduced REINDEX CONCURRENTLY feature. It is meant to replace the now legacy method (creating a temp index and swapping indexes manually).

This is related to #329717 (closed). We introduce a new strategy here and also allow it to work on unique indexes. However, both is hidden behind a feature flag.

Change of strategy is hidden behind database_reindexing_pg12 feature flag. Rollout issue is #334372 (closed).

Notes for PG12 reindexing

  • Cancelling a concurrent reindex leaves behind INVALID index: "ci_builds_pkey_ccnew" UNIQUE, btree (id) INVALID
  • Starting another concurrent reindex after that increments a counter on the name: "ci_builds_pkey_ccnew1" UNIQUE, btree (id) INVALID
  • Locking from REINDEX CONCURRENTLY ci_builds_pkey - SHARE UPDATE EXCLUSIVE level (just like CREATE INDEX CONCURRENTLY)
           mode           |       relation       
--------------------------+----------------------
 ShareUpdateExclusiveLock | ci_builds_pkey_ccnew
 ShareUpdateExclusiveLock | ci_builds_pkey
 ExclusiveLock            | 
 ShareUpdateExclusiveLock | ci_builds
 RowExclusiveLock         | ci_builds_pkey_ccnew
(5 rows)

Logs

JSON logs and example runs

JSON log produced

{"severity":"INFO","time":"2021-06-24T09:55:19.834Z","correlation_id":null,"message":"Starting reindex of index_users_on_admin","index":"public.index_users_on_admin","table":"users","estimated_bloat_bytes":0,"index_size_before_bytes":8192}
{"severity":"INFO","time":"2021-06-24T09:55:19.861Z","correlation_id":null,"message":"Finished reindex of index_users_on_admin","index":"public.index_users_on_admin","table":"users","estimated_bloat_bytes":0,"index_size_before_bytes":8192,"index_size_after_bytes":8192,"duration_s":0.02}

Regular reindex

  TRANSACTION (0.1ms)  BEGIN /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'*/
   lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'
  Gitlab::Database::Reindexing::ReindexAction Create (0.4ms)  INSERT INTO "postgres_reindex_actions" ("action_start", "ondisk_size_bytes_start", "index_identifier", "bloat_estimate_bytes_start") VALUES ('2021-06-24 09:31:00.647860', 8192, 'public.index_users_on_admin', 0) RETURNING "id" /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'*/
  ↳ lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'
  TRANSACTION (11.4ms)  COMMIT /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'*/
   lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'
   (0.2ms)  SET statement_timeout TO '32400s' /*application:web,line:/lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'*/
  ↳ lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'
   (26.1ms)  REINDEX INDEX CONCURRENTLY "public"."index_users_on_admin" /*application:web,line:/lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'*/
   lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'
   (0.3ms)  RESET statement_timeout /*application:web,line:/lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'*/
  ↳ lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'
  Gitlab::Database::PostgresIndex Load (10.8ms)  SELECT "postgres_indexes".* FROM "postgres_indexes" WHERE (name ~* 'index_users_on_admin\_ccnew[0-9]*') /*application:web,line:/lib/gitlab/database/reindexing/reindex_concurrently.rb:75:in `cleanup_dangling_indexes'*/
   lib/gitlab/database/reindexing/reindex_concurrently.rb:75:in `cleanup_dangling_indexes'
  Gitlab::Database::PostgresIndex Load (1.7ms)  SELECT "postgres_indexes".* FROM "postgres_indexes" WHERE "postgres_indexes"."identifier" = 'public.index_users_on_admin' LIMIT 1 /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:27:in `finish'*/
  ↳ lib/gitlab/database/reindexing/reindex_action.rb:27:in `finish'
  Gitlab::Database::PostgresIndex Load (1.6ms)  SELECT "postgres_indexes".* FROM "postgres_indexes" WHERE "postgres_indexes"."identifier" = 'public.index_users_on_admin' LIMIT 1 /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:27:in `finish'*/
   lib/gitlab/database/reindexing/reindex_action.rb:27:in `finish'
  TRANSACTION (0.1ms)  BEGIN /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:33:in `finish'*/
  ↳ lib/gitlab/database/reindexing/reindex_action.rb:33:in `finish'
  Gitlab::Database::Reindexing::ReindexAction Update (0.3ms)  UPDATE "postgres_reindex_actions" SET "action_end" = '2021-06-24 09:31:00.722602', "ondisk_size_bytes_end" = 8192, "state" = 1 WHERE "postgres_reindex_actions"."id" = 4 /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:33:in `finish'*/
   lib/gitlab/database/reindexing/reindex_action.rb:33:in `finish'
  TRANSACTION (2.3ms)  COMMIT /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:33:in `finish'*/
  ↳ lib/gitlab/database/reindexing/reindex_action.rb:33:in `finish'

Reindex with dangling _ccnew index

  TRANSACTION (0.1ms)  BEGIN /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'*/
   lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'
  Gitlab::Database::Reindexing::ReindexAction Create (0.4ms)  INSERT INTO "postgres_reindex_actions" ("action_start", "ondisk_size_bytes_start", "index_identifier", "bloat_estimate_bytes_start") VALUES ('2021-06-24 09:29:30.366550', 8192, 'public.index_us
ers_on_admin', 0) RETURNING "id" /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'*/
  ↳ lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'
  TRANSACTION (4.2ms)  COMMIT /*application:web,line:/lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'*/
   lib/gitlab/database/reindexing/reindex_action.rb:18:in `create_for'
   (0.1ms)  SET statement_timeout TO '32400s' /*application:web,line:/lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'*/
  ↳ lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'
   (21.6ms)  REINDEX INDEX CONCURRENTLY "public"."index_users_on_admin" /*application:web,line:/lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'*/
   lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'
   (0.2ms)  RESET statement_timeout /*application:web,line:/lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'*/
  ↳ lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'
  Gitlab::Database::PostgresIndex Load (10.9ms)  SELECT "postgres_indexes".* FROM "postgres_indexes" WHERE (name ~* 'index_users_on_admin\_ccnew[0-9]*') /*application:web,line:/lib/gitlab/database/reindexing/reindex_concurrently.rb:75:in `cleanup_dangling_indexes'*/
   lib/gitlab/database/reindexing/reindex_concurrently.rb:75:in `cleanup_dangling_indexes'
   (0.1ms)  SET lock_timeout TO '60000ms' /*application:web,line:/lib/gitlab/database/with_lock_retries.rb:168:in `execute'*/
  ↳ lib/gitlab/database/with_lock_retries.rb:168:in `execute'
   (2.7ms)  DROP INDEX CONCURRENTLY IF EXISTS "public"."index_users_on_admin_ccnew" /*application:web,line:/lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'*/
   lib/gitlab/database/reindexing/reindex_concurrently.rb:106:in `execute'
   (0.1ms)  RESET idle_in_transaction_session_timeout; RESET lock_timeout /*application:web,line:/lib/gitlab/database/with_lock_retries.rb:168:in `execute'*/
  ↳ lib/gitlab/database/with_lock_retries.rb:168:in `execute'

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Edited by Andreas Brandl

Merge request reports