Duplicate issue documents due to an issue with es_id
Summary
There are duplicate issue documents in Elasticsearch with different _id
values as a result of "Closed issue was returned when searching open i..." (#430149 - closed).
Example:
"_id": "issue_140042675"
and
"_id": "140042675"
This could cause duplicate search results, and it is also preventing the BackfillInitialEmbeddings
migration from finishing.
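To gauge how many issues are affected, the _id values pulled from the index could be grouped by their numeric part. The following is only a minimal sketch: it assumes the two formats are exactly `issue_<id>` and bare `<id>`, as in the example above, and the function name `find_duplicate_issue_ids` and the sample list are hypothetical. It does not decide which format is the wrong one.

```python
import re
from collections import defaultdict

# Assumption: an _id is either "issue_<digits>" or bare "<digits>",
# matching the two forms shown in the example above.
ES_ID_PATTERN = re.compile(r"(?:issue_)?(\d+)")


def find_duplicate_issue_ids(es_ids):
    """Map each issue id to its _id values when it is indexed under more than one."""
    forms_by_issue = defaultdict(set)
    for es_id in es_ids:
        match = ES_ID_PATTERN.fullmatch(es_id)
        if match:  # skip _id values that are in neither format
            forms_by_issue[int(match.group(1))].add(es_id)
    return {
        issue_id: sorted(forms)
        for issue_id, forms in forms_by_issue.items()
        if len(forms) > 1
    }


# Hypothetical sample; in practice the list would come from a scan of the index.
sample = ["issue_140042675", "140042675", "issue_7", "8"]
duplicates = find_duplicate_issue_ids(sample)
print(duplicates)
```

Only issues that appear under both formats are reported, so documents that exist under a single _id are left out of the result.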
Possible fixes
We need a migration to remove issue documents with the wrong es_id.
We might want to use:
- Search::Elastic::MigrationDatabaseBackfillHelper
- Search::Elastic::MigrationDeleteBasedOnSchemaVersion
We should also find out why the previous ReindexAllIssues
migrations missed some of the docs.
Edited by Madelein van Niekerk