[Index state tracking: Rollout] Support pause indexing & migration worker preflight checks

Description

All indexing requests to advanced search must respect the elasticsearch_pause_indexing setting in application settings. This setting is used to avoid data loss during maintenance tasks (such as upgrades) and is part of the zero-downtime reindexing feature, which uses the Elasticsearch/OpenSearch Reindex API to create a new index and move all data from the old index to the new one.

Proposal

All AI::ActiveContext workers that perform indexing and migrations should be updated to handle the pause and preflight settings, as described for each worker below.

BulkProcessWorker

https://docs.gitlab.com/development/sidekiq/worker_attributes/#job-pause-control

  • Use pause_control middleware
  • Create a new strategy for ai_active_context

The should_pause? method needs to check a few conditions to determine when to pause and which setting to check:

  • if the connection uses the settings from Advanced search (as described in #525344 (closed)), use Gitlab::CurrentSettings.elasticsearch_pause_indexing
  • if the active connection uses Elasticsearch or OpenSearch but not the Advanced search settings, use a new setting (not yet created)
  • if the connection is PostgreSQL-backed, do nothing
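
The three conditions above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the actual pause_control strategy: Connection, Settings, uses_advanced_search_settings, and active_context_pause_indexing are hypothetical names, and the last one stands in for the new setting that has not been created yet.

```ruby
# Hypothetical stand-ins for the active connection and application settings.
# None of these are real GitLab classes.
Connection = Struct.new(:adapter, :uses_advanced_search_settings, keyword_init: true)
Settings = Struct.new(:elasticsearch_pause_indexing, :active_context_pause_indexing, keyword_init: true)

# Adapters for which pausing is meaningful in this sketch.
SEARCH_ADAPTERS = %i[elasticsearch opensearch].freeze

def should_pause?(connection, settings)
  # No active connection: nothing to pause.
  return false if connection.nil?

  # PostgreSQL-backed (or any other non-search adapter): do nothing.
  return false unless SEARCH_ADAPTERS.include?(connection.adapter)

  if connection.uses_advanced_search_settings
    # Shares the Advanced search settings: follow the existing setting.
    settings.elasticsearch_pause_indexing ? true : false
  else
    # Own Elasticsearch/OpenSearch connection: follow the new,
    # not-yet-created setting (hypothetical name).
    settings.active_context_pause_indexing ? true : false
  end
end
```

For example, with elasticsearch_pause_indexing enabled and the hypothetical new setting disabled, only a connection that shares the Advanced search settings would be paused.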

Questions to answer

  • What happens if someone uses the same connection information as Advanced search but copies the data over? Should the model allow that? This creates a situation where maintenance tasks could cause data loss.

MigrationWorker

NOTE: This worker should not respect the elasticsearch_pause_indexing setting (the Elastic::MigrationWorker does not).

Questions to answer

  • The Elastic::MigrationWorker has a few preflight checks. Should this worker consume and respect them?
    • ::Gitlab::CurrentSettings.elastic_migration_worker_enabled?
    • helper.migrations_index_exists?
    • Search::Elastic::ReindexingTask.current (the current reindexing task, if any)
  • The Elastic::MigrationWorker records the index pause state before a migration runs and resets it to the original state after the migration completes. Should this worker do the same?
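
If the answer to both questions is yes, the behavior could look roughly like the sketch below. It assumes each preflight check blocks the run when it fails and that the pause state lives in a plain settings hash; Preflight, preflight_failure, and with_pause_restored are hypothetical names used only for illustration and do not mirror the real Elastic::MigrationWorker code.

```ruby
# Hypothetical snapshot of the three preflight checks listed above.
Preflight = Struct.new(:worker_enabled, :migrations_index_exists, :reindexing_in_progress,
                       keyword_init: true)

# Returns a symbol naming the first failed check, or nil when the
# migration may run.
def preflight_failure(checks)
  return :worker_disabled unless checks.worker_enabled
  return :migrations_index_missing unless checks.migrations_index_exists
  return :reindexing_in_progress if checks.reindexing_in_progress

  nil
end

# Remembers the pause state, runs the block, and restores the original
# state afterwards, even if the block raises.
def with_pause_restored(settings)
  original = settings[:pause_indexing]
  yield
ensure
  settings[:pause_indexing] = original
end
```

A migration that pauses indexing while it runs would then be wrapped as with_pause_restored(settings) { ... }, so the setting returns to its earlier value afterwards.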
Edited by Terri Chu