elastic: Removing documents from the index can fail with a conflict error

Summary

Noticed during https://gitlab.com/gitlab-org/gitlab-ee/issues/11419

When removing a project or namespace from the list of elasticsearch-indexable ones, we remove all documents related to that project or namespace from the index. We also do the same in response to some git push operations.

This is important to ensure that only correct results are returned!

Steps to reproduce

  • Start with an index containing one large indexed project
  • Remove the project from the list of indexed namespaces

You can replicate this in the rails console too, e.g., my attempt on staging.gitlab.com:

project = Project.find(...)
ElasticIndexerWorker.new.perform(:delete, project.class.to_s, project.id, project.es_id, es_parent: project.es_parent)

The result was an Elasticsearch::Transport::Transport::Errors::Conflict ([409]) error:

{"took":26354,"timed_out":false,"total":217207,"deleted":52845,"batches": 53,"version_conflicts":155, ...

So that's something to be looked into. Many of the documents were removed, but not all. Re-running the job twice deleted more documents each time, and it eventually succeeded:

{"_index"=>"gitlab-production", "_type"=>"doc", "_id"=>"project_13083", "_version"=>3, "result"=>"deleted", "_shards"=>{"total"=>2, "successful"=>2, "failed"=>0}, "_seq_no"=>826820, "_primary_term"=>1}

After that, there were still 403 documents in the index - all with a type of personal_snippet or project_snippet. That's expected.

What is the current bug behavior?

The Sidekiq job may fail with a conflict error, leaving some documents in ES. You can also replicate this in the rails console.

What is the expected correct behavior?

We should delete documents in such a way that these conflicts can't happen.
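
One possible direction, sketched here rather than proposed as the fix: ask Elasticsearch to skip conflicting documents instead of aborting, then repeat the delete until a pass reports no conflicts. This mirrors the manual re-runs described above. The helper name `delete_until_clean`, the `max_attempts` limit, and the `query` variable are hypothetical. The commented call assumes the elasticsearch-ruby client's `delete_by_query` with the `conflicts: 'proceed'` option, which may not match how the indexer issues deletes today.

```ruby
# Hypothetical helper, not part of GitLab: re-run a delete-by-query until a
# pass reports no version conflicts, or give up after max_attempts passes.
# The block must return a Hash shaped like the delete-by-query response,
# e.g. { 'deleted' => 52845, 'version_conflicts' => 155 }.
def delete_until_clean(max_attempts: 5)
  deleted = 0

  1.upto(max_attempts) do |attempt|
    response = yield
    deleted += response.fetch('deleted', 0)

    # A pass with no conflicts means no matching document was skipped.
    return { attempts: attempt, deleted: deleted } if response.fetch('version_conflicts', 0).zero?
  end

  raise "version conflicts remained after #{max_attempts} attempts"
end

# With a real client the block might look like this (assumption: the
# elasticsearch-ruby client, with `query` standing in for the project filter):
#
#   delete_until_clean do
#     client.delete_by_query(index: 'gitlab-production', conflicts: 'proceed', body: { query: query })
#   end

# Stubbed responses so the sketch runs without a cluster.
stubbed = [
  { 'deleted' => 52845, 'version_conflicts' => 155 },
  { 'deleted' => 155, 'version_conflicts' => 0 }
]
result = delete_until_clean { stubbed.shift }
```

Bounding the number of attempts keeps a job from looping forever if something keeps writing to the same documents. In that case the job still fails, but only after the retries are exhausted.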