Elasticsearch bulk indexer can write stale data when reading from a lagged replica
Summary
ElasticIndexBulkCronWorker can index a stale version of a record into the advanced search index when the replica it reads from has not yet applied the UPDATE that triggered the enqueue. Once the stale value is written, the Redis ZSET ref is removed and no retry occurs — the index stays stale until the next write to that record.
Steps to reproduce
Hard to reproduce on demand (depends on replica lag at the moment the bulk cron tick fires). Observed in production on work item gitlab-org/gitlab#504460 (database_id 157334722):
- User changes milestone_id on the work item. after_commit :maintain_elasticsearch_update fires on the writer's connection (primary), calling Elastic::ProcessBookkeepingService.track!, which does a Redis ZADD.
- ElasticIndexBulkCronWorker runs shortly after and reads the record via WorkItem.id_in(ids) from a replica.
- If the replica has not yet applied the UPDATE from step 1, the indexer reads the prior version of the row.
- The stale row is written to Elasticsearch and the Redis ref is removed via zremrangebyscore. No retry.
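The sequence above can be sketched as a toy model. This is only an illustration under stated assumptions: plain hashes and an array stand in for the primary, the replica, Elasticsearch and the Redis ZSET, and the milestone_id values 1 and 2 are invented.

```ruby
# Toy model of the race. Nothing here is GitLab code; all names are hypothetical.
id = 157_334_722

primary = { id => { milestone_id: 1 } }      # invented starting value
replica = primary.transform_values(&:dup)    # replica starts in sync
queue   = []                                 # stands in for the Redis ZSET
index   = {}                                 # stands in for Elasticsearch

# Step 1: the UPDATE commits on the primary and the ref is tracked.
primary[id][:milestone_id] = 2
queue << id

# Steps 2-3: the cron tick fires before the replica applies the UPDATE,
# so the indexer reads the prior version of the row.
queue.each { |ref| index[ref] = replica[ref].dup }

# Step 4: the ref is removed after indexing, so nothing triggers a retry.
queue.clear

# Replication catches up afterwards, but the index is never revisited.
replica[id] = primary[id].dup

puts "primary: #{primary[id]}, replica: #{replica[id]}, index: #{index[id]}"
puts "queue empty: #{queue.empty?}"
```

At the end of the run the primary and the replica agree on the new value, the queue is empty, and the index still holds the old value.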
What is the current bug behavior?
Elasticsearch shows the pre-change milestone_id until the next write to the record. In the observed case it stayed stale for ~1h 53m until subsequent milestone changes re-enqueued it.
What is the expected correct behavior?
The indexer either reads the post-commit row, or detects that it read pre-commit and retries.
Relevant logs and/or screenshots
Three indexing events for database_id 157334722. search_indexing_duration_s is Time.current - record.updated_at at index time:
| Time | Event | search_indexing_duration_s |
|---|---|---|
| 2026-05-15T09:58:49.452 | track_items enqueue (WorkItem\|157334722\|group_9970) | - |
| 2026-05-15T09:58:51.301 | indexing_done | 254293 (~2.94 days) |
| 2026-05-15T11:51:35.473 | track_items enqueue | - |
| 2026-05-15T11:51:36.281 | indexing_done | 0 |
| 2026-05-15T11:51:41.983 | track_items enqueue | - |
| 2026-05-15T11:51:43.592 | indexing_done | 1 |
The 09:58:51 read returned a record whose updated_at predated the milestone change by ~3 days (the row's previous actual update). The enqueue→read gap was ~1.85s, indicating the replica was behind by at least that much and had not yet applied the milestone UPDATE.
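The arithmetic behind these figures can be rechecked with a short script. It is a sketch that assumes the logged timestamps are UTC (the log lines above carry no zone) and that indexing_done is a reasonable proxy for the time of the DB read.

```ruby
require "time"

# Timestamps copied from the table; the trailing "Z" (UTC) is an assumption.
enqueued_at  = Time.iso8601("2026-05-15T09:58:49.452Z")
indexed_at   = Time.iso8601("2026-05-15T09:58:51.301Z")
reindexed_at = Time.iso8601("2026-05-15T11:51:36.281Z")

# Gap between the Redis enqueue and the first indexing_done event.
enqueue_to_index_s = (indexed_at - enqueued_at).round(3)

# search_indexing_duration_s is Time.current - record.updated_at, so the
# logged value implies the updated_at the indexer saw on the row it read.
reported_duration_s = 254_293
implied_updated_at  = indexed_at - reported_duration_s
duration_days       = (reported_duration_s / 86_400.0).round(2)

# How long the index held the stale value before the next indexing_done.
stale_minutes = ((reindexed_at - indexed_at) / 60).round

puts "enqueue -> indexing_done: #{enqueue_to_index_s}s"
puts "implied updated_at:       #{implied_updated_at.utc.iso8601}"
puts "duration in days:         #{duration_days}"
puts "stale for:                #{stale_minutes} min"
```

The implied updated_at falls on 2026-05-12, well before the 09:58:49 enqueue, which matches a read of the pre-UPDATE row.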
Possible fixes
ElasticIndexBulkCronWorker declares data_consistency :sticky, but :sticky only protects against replica lag when the job is enqueued in a session that just performed the write. Here the chain is:

    write happens → after_commit → Redis ZADD (no Sidekiq enqueue)
    ... cron tick → schedule_shards → shard worker → DB read

The Redis ZSET does not carry the write LSN. The cron-driven shard worker is enqueued from a session with no writes, so :sticky has no LSN to wait for and degrades to replica-only with no catch-up requirement. Carrying the LSN forward through the ZSET is not practical: the ZSET member is a fixed klass|id|routing string and is what makes dedup work.
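A simplified model of that decision shows why the cron path gets no protection. It is a sketch, not GitLab's load-balancer code: the Database struct, pick_database, the integer LSNs and the milestone_id values are all hypothetical.

```ruby
# Simplified, hypothetical model of how a job's read target depends on its
# declared consistency and on the write LSN captured at enqueue time.
Database = Struct.new(:name, :lsn, :rows)

ID = 157_334_722

# Invented state: the primary is one LSN ahead and holds the new milestone.
primary = Database.new("primary", 101, { ID => { milestone_id: 2 } })
replica = Database.new("replica", 100, { ID => { milestone_id: 1 } })

def pick_database(consistency, enqueuer_write_lsn, primary, replica)
  case consistency
  when :always
    primary
  when :sticky
    # The enqueuing session performed no write, so there is no LSN to
    # compare against and the replica is used without a catch-up check.
    return replica if enqueuer_write_lsn.nil?

    replica.lsn >= enqueuer_write_lsn ? replica : primary
  else
    raise ArgumentError, "unknown consistency: #{consistency}"
  end
end

# Cron-driven shard worker: enqueued from a session with no writes.
cron_read = pick_database(:sticky, nil, primary, replica).rows[ID]

# Hypothetical job enqueued directly by the writing session.
direct_read = pick_database(:sticky, 101, primary, replica).rows[ID]

# Same cron path, but with reads pinned to the primary.
always_read = pick_database(:always, nil, primary, replica).rows[ID]

puts "sticky via cron:   #{cron_read}"
puts "sticky via writer: #{direct_read}"
puts "always:            #{always_read}"
```

In this model only the cron path returns the old row; the other two both read from the primary while the replica is behind.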
The two viable approaches:
- Change data_consistency to :always on ElasticIndexBulkCronWorker (and the initial variant). Smallest patch, deterministically eliminates the race. The cost is that all bulk indexer DB reads go to primary; batches are small per shard so the load impact should be measurable but bounded.
- Detect and re-enqueue stale reads. After preload, compare record.updated_at against a freshness threshold; if a record is suspiciously old for a ref that should reflect a recent write, log a warning and re-track!. Has two complications to solve before it can ship:
  - Initial indexing: ProcessInitialBookkeepingService backfills records that may legitimately have updated_at from years ago. A blanket age-based check would false-positive on every initial-indexed record. Needs a way to opt out for the initial path (likely an override on the subclass).
  - Paused indexing: when elasticsearch_pause_indexing? is on, track! keeps writing to Redis but the cron worker no-ops. When indexing resumes, the queue contains refs whose triggering writes may be hours old. Every read after resume would look stale to a naive freshness check. Needs the check to be aware of (or suppressed during) a post-resume window.
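A rough sketch of the second approach, including both opt-outs, might look like the following. Every name here (FreshnessCheck, InitialFreshnessCheck, suspicious?, threshold_s, resumed_at, resume_window_s) is hypothetical, and the 60-second threshold and 300-second resume window are placeholder values, not recommendations.

```ruby
# Hypothetical sketch of an age-based staleness check. None of these classes
# exist in GitLab; they only illustrate the shape of the two opt-outs.
Record = Struct.new(:id, :updated_at)

class FreshnessCheck
  def initialize(threshold_s:, resumed_at: nil, resume_window_s: 0)
    @threshold_s     = threshold_s
    @resumed_at      = resumed_at
    @resume_window_s = resume_window_s
  end

  # Paths that legitimately index old rows can override this to opt out.
  def enabled?
    true
  end

  # True when the row looks too old for a ref that should reflect a recent write.
  def suspicious?(record, now:)
    return false unless enabled?
    return false if in_resume_window?(now)

    now - record.updated_at > @threshold_s
  end

  private

  # Suppress the check shortly after indexing is unpaused, when queued refs
  # may point at writes that are hours old.
  def in_resume_window?(now)
    !@resumed_at.nil? && now - @resumed_at <= @resume_window_s
  end
end

# Initial indexing backfills rows of any age, so the check is disabled.
class InitialFreshnessCheck < FreshnessCheck
  def enabled?
    false
  end
end

now       = Time.utc(2026, 5, 15, 9, 58, 51)
stale_row = Record.new(157_334_722, now - 254_293) # age taken from the logs above
fresh_row = Record.new(1, now - 2)                 # invented fresh record

check      = FreshnessCheck.new(threshold_s: 60)
to_retrack = [stale_row, fresh_row].select { |r| check.suspicious?(r, now: now) }

initial = InitialFreshnessCheck.new(threshold_s: 60)
resumed = FreshnessCheck.new(threshold_s: 60, resumed_at: now - 10, resume_window_s: 300)

puts "re-track ids: #{to_retrack.map(&:id)}"
puts "initial path flags stale row: #{initial.suspicious?(stale_row, now: now)}"
puts "post-resume flags stale row:  #{resumed.suspicious?(stale_row, now: now)}"
```

A real implementation would still need to decide where the resume timestamp comes from and how to avoid re-tracking in a loop when a row is old for legitimate reasons; the sketch leaves both open.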
Output of checks
This bug happens on GitLab.com.