Fix repo pushes messing with initial Elasticsearch indexing

What does this MR do?

If a project receives a push before its initial Elasticsearch indexing begins, ElasticCommitIndexerWorker sets the project's IndexStatus to the last commit in that push. When ElasticBatchProjectIndexerWorker finally reaches the project, it skips it because an IndexStatus is already set, so the data that predates the push is never indexed.

To fix this, we change ElasticBatchProjectIndexerWorker to only care about IndexStatus if UPDATE_INDEX has been set. This can result in some data being indexed twice, but re-indexing does not create duplicates, and it is preferable to having the data not indexed at all.
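
The change in skip logic can be illustrated with a minimal, self-contained sketch. It is not the actual worker code: `Project`, `record_push`, `skip_before_fix?`, and `skip_after_fix?` are hypothetical names invented for illustration, and the UPDATE_INDEX setting is modeled as a plain boolean argument.

```ruby
# Hypothetical stand-in for a project. index_status is nil until
# something records a commit SHA for it.
Project = Struct.new(:id, :index_status)

# Models the push-triggered worker: it records the last pushed commit
# as the project's index status.
def record_push(project, sha)
  project.index_status = sha
end

# Behaviour described as the bug: any project that already has an
# index status is skipped by the batch indexer.
def skip_before_fix?(project)
  !project.index_status.nil?
end

# Behaviour described as the fix: the index status is only consulted
# when UPDATE_INDEX is set (assumed here to be a boolean).
def skip_after_fix?(project, update_index:)
  update_index && !project.index_status.nil?
end

project = Project.new(1, nil)
record_push(project, "abc123") # push arrives before the initial indexing run

skipped_before      = skip_before_fix?(project)
skipped_after       = skip_after_fix?(project, update_index: false)
skipped_incremental = skip_after_fix?(project, update_index: true)
```

In this sketch the project is skipped under the old check, is indexed by an initial run under the new check, and is still skipped when UPDATE_INDEX is set.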

What are the relevant issue numbers?

  • #8013 (closed) - Race condition while indexing new projects: it turns out this change is not enough to fix that issue, though I am currently having trouble reproducing it. That fix should go in another MR.
  • #8628 (closed) - Repository pushes while Indexing on ElasticSearch omits data

Does this MR meet the acceptance criteria?

Closes #8628 (closed)

Edited by Coung Ngo