[Index state tracking: Rollout] Support pause indexing & migration worker preflight checks
Description
All indexing requests to advanced search must respect the elasticsearch_pause_indexing setting in application settings. This setting is used to avoid data loss during maintenance tasks (like upgrades) and is part of the zero downtime reindexing feature, which uses the Elasticsearch/OpenSearch Reindex API to create a new index and move all data from the old index into it.
Proposal
All AI::ActiveContext workers that perform indexing and migrations should be updated to handle the relevant pause and preflight settings.
BulkProcessWorker
https://docs.gitlab.com/development/sidekiq/worker_attributes/#job-pause-control
- Use pause_control middleware
- Create a new strategy for ai_active_context
The should_pause? method needs to check a few conditions to determine when to pause and which setting to check:
- if using the setting from Advanced search (as described in #525344 (closed)), use Gitlab::CurrentSettings.elasticsearch_pause_indexing
- if the active connection uses Elasticsearch or OpenSearch but not Advanced search, use a new setting (not yet created)
- if using the postgres backend, do nothing
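The three conditions above could be sketched roughly as follows. This is a standalone illustration, not the actual pause_control strategy API: the class name AiActiveContextPauseStrategy, the Settings and Connection structs, the uses_advanced_search_settings flag, and the active_context_pause_indexing setting are all hypothetical stand-ins (the last one for the new setting that has not been created).

```ruby
# Hypothetical sketch: none of these names exist in GitLab as written here.
# Settings are modelled as a plain Struct so the example runs on its own.
Settings = Struct.new(
  :elasticsearch_pause_indexing,   # existing Advanced search setting
  :active_context_pause_indexing,  # hypothetical new setting (not created yet)
  keyword_init: true
)

# Hypothetical description of the active connection.
Connection = Struct.new(:adapter, :uses_advanced_search_settings, keyword_init: true)

class AiActiveContextPauseStrategy
  SEARCH_ADAPTERS = %i[elasticsearch opensearch].freeze

  def initialize(settings:, connection:)
    @settings = settings
    @connection = connection
  end

  def should_pause?
    # No active connection, or a postgres backed one: do nothing.
    return false if @connection.nil?
    return false unless SEARCH_ADAPTERS.include?(@connection.adapter)

    if @connection.uses_advanced_search_settings
      # Connection shares Advanced search settings: follow its pause flag.
      @settings.elasticsearch_pause_indexing ? true : false
    else
      # Separate Elasticsearch/OpenSearch connection: follow the new setting.
      @settings.active_context_pause_indexing ? true : false
    end
  end
end

settings = Settings.new(elasticsearch_pause_indexing: true,
                        active_context_pause_indexing: false)
shared   = Connection.new(adapter: :elasticsearch, uses_advanced_search_settings: true)
postgres = Connection.new(adapter: :postgres, uses_advanced_search_settings: false)

puts AiActiveContextPauseStrategy.new(settings: settings, connection: shared).should_pause?
puts AiActiveContextPauseStrategy.new(settings: settings, connection: postgres).should_pause?
```

How the strategy learns whether a connection shares Advanced search settings is left open here; that depends on the outcome of #525344.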
questions to answer
- What happens if someone uses the same connection information as Advanced search but copies the data over? Should the model allow that? This creates a situation where maintenance tasks could cause data loss.
MigrationWorker
NOTE: This worker should not respect the elasticsearch_pause_indexing setting (the Elastic::MigrationWorker does not).
questions to answer
- The Elastic::MigrationWorker has a few preflight checks, should this worker consume and respect them?
  - ::Gitlab::CurrentSettings.elastic_migration_worker_enabled?
  - helper.migrations_index_exists?
  - Search::Elastic::ReindexingTask.current (reindexing task)
- The Elastic::MigrationWorker keeps a record of the index pause state before/after a migration completes and resets it to the original state. Does this worker do that?
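To make the two questions concrete, here is a minimal sketch of what consuming the preflight checks and restoring the pause state could look like. It does not reproduce the Elastic::MigrationWorker implementation: MigrationPreflight, blocking_reason, and with_indexing_paused are hypothetical names, the lambdas stand in for the three checks listed above, and the settings hash stands in for application settings. It also assumes that a current reindexing task blocks the run and that the migration pauses indexing while it executes.

```ruby
# Hypothetical sketch; lambdas stand in for the real checks so this runs standalone.
class MigrationPreflight
  def initialize(worker_enabled:, migrations_index_exists:, reindexing_in_progress:)
    @worker_enabled = worker_enabled
    @migrations_index_exists = migrations_index_exists
    @reindexing_in_progress = reindexing_in_progress
  end

  # Returns the first failing check as a symbol, or nil when the worker may run.
  def blocking_reason
    return :worker_disabled unless @worker_enabled.call
    return :migrations_index_missing unless @migrations_index_exists.call
    return :reindexing_in_progress if @reindexing_in_progress.call

    nil
  end
end

# Pauses indexing for the duration of the block, then restores the flag to
# whatever it was before, even if the block raises.
def with_indexing_paused(settings)
  original = settings[:pause_indexing]
  settings[:pause_indexing] = true
  yield
ensure
  settings[:pause_indexing] = original
end

preflight = MigrationPreflight.new(
  worker_enabled: -> { true },
  migrations_index_exists: -> { true },
  reindexing_in_progress: -> { false }
)

settings = { pause_indexing: false }
if preflight.blocking_reason.nil?
  with_indexing_paused(settings) { puts "running migration, paused=#{settings[:pause_indexing]}" }
end
puts "after migration, paused=#{settings[:pause_indexing]}"
```

Restoring the recorded value rather than always unpausing matters when an administrator had already paused indexing before the migration started.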