Bulk indexing cron workers should respect pause setting

What does this MR do and why?

Found while preparing to upgrade the Elasticsearch version in staging: indexing was paused via the application setting, but the indexing queues continued to drain.

The staging screenshots (not reproduced here) show data still being processed while indexing was paused.

This MR adds an indexing-paused check to the concern used by all of the bulk cron workers that index data. If indexing is paused, the cron worker does not run. I also added a new test for the concern.
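For reviewers who want the shape of the guard without opening the diff, here is a minimal plain-Ruby sketch of the idea. It is not the code in this MR: `Settings`, `BulkCronWorkerSketch`, `ExampleBulkWorker`, and `process_queue` are hypothetical names made up for illustration, and the real concern reads the pause flag from the application settings rather than from a struct.

```ruby
# Hypothetical stand-in for the application settings; not GitLab's real API.
Settings = Struct.new(:indexing_enabled, :pause_indexing, keyword_init: true)

# Sketch of a concern shared by the bulk cron workers (assumed structure).
module BulkCronWorkerSketch
  # Returns false without doing any work when indexing is disabled or paused;
  # otherwise processes the queue and returns the number of records handled.
  def perform
    return false unless settings.indexing_enabled
    return false if settings.pause_indexing

    process_queue
  end
end

# Hypothetical worker that includes the concern and drains an in-memory queue.
class ExampleBulkWorker
  include BulkCronWorkerSketch

  attr_reader :settings, :queue

  def initialize(settings, queue)
    @settings = settings
    @queue = queue
  end

  private

  # Empties the queue and reports how many records were processed.
  def process_queue
    processed = queue.size
    queue.clear
    processed
  end
end

paused = ExampleBulkWorker.new(
  Settings.new(indexing_enabled: true, pause_indexing: true), [1, 2, 3]
)
paused.perform     # returns false; the queue is left untouched
```

Putting the check in the shared `perform` means every worker that includes the concern picks up the pause behavior without a per-worker change.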

AI summary

This change makes the bulk indexing cron workers respect the Elasticsearch indexing pause setting. The code now checks a setting called elasticsearch_pause_indexing? and stops processing if indexing is paused. Tests were added to verify this behavior. They cover scenarios including when indexing is enabled/disabled, paused/unpaused, and when the cluster is healthy/unhealthy. The tests also verify the worker's behavior for processing individual shards, scheduling multiple shards, handling errors, and managing locks. Additional test coverage was added for the requeuing logic that determines when to schedule follow-up indexing jobs based on the number of records processed and whether there were any failures.

How to set up and validate locally

  1. Set up GDK for Elasticsearch.
  2. Pause indexing via the application settings or the Rails console.
  3. Queue something for indexing and run one of the bulk cron workers.
  4. Confirm that nothing is indexed or processed; monitor log/elasticsearch.log.
# Pause indexing, queue some records, then run a bulk cron worker.
ApplicationSetting.update!(elasticsearch_pause_indexing: true)
Elastic::ProcessBookkeepingService.track!(*Project.last(10))
ElasticIndexBulkCronWorker.new.perform # no indexing should occur

# Unpause indexing and run the worker again.
ApplicationSetting.update!(elasticsearch_pause_indexing: false)
ElasticIndexBulkCronWorker.new.perform # indexing should occur

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Terri Chu
