Project transfer fix for ES indexing
## What does this MR do and why?
Related to #458804 (closed)
The linked issue reported project data randomly going missing from the index on a dedicated instance. Specifically, code/blob scoped data was missing, but it was not confirmed whether all data was missing (meaning database-backed documents such as issues, merge requests, notes, etc.). Debugging showed that one of the projects had recently been transferred using `Projects::TransferService` and that `ElasticDeleteProjectWorker` had been called with that transfer as the root caller.
I believe there is a race condition when projects are transferred within the same root-level namespace: the transfer worker queues the project and its associated data for re-indexing and also schedules `ElasticDeleteProjectWorker`, whose delete can remove the freshly re-indexed associated documents.
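The suspected interleaving can be sketched with a toy in-memory index (plain Ruby, no GitLab code; all names and document shapes are illustrative):

```ruby
# Toy in-memory "index" illustrating the suspected race. All identifiers
# here are illustrative; none of this is GitLab code.
index = {}

# Step 1: the transfer worker re-indexes the project and its blob documents.
index["project-1"] = { project_id: 1, type: :project }
index["blob-1"]    = { project_id: 1, type: :blob }

# Step 2 (old behavior): the delete worker then removes *every* document
# belonging to the project, including the blobs that were just re-indexed,
# so code search comes back empty.
after_full_delete = index.reject { |_id, doc| doc[:project_id] == 1 }
p after_full_delete.size # => 0

# Step 2' (new behavior, project-only delete): only the project document is
# removed; associated documents such as blobs survive.
after_project_only = index.reject { |_id, doc| doc[:type] == :project }
p after_project_only.keys # => ["blob-1"]
```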
This MR:

- introduces a new option for `ElasticDeleteProjectWorker` to only remove the `project` document and leave all associated data
- uses the new option in `Elastic::ProjectTransferWorker`
- updates specs
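The new option might look something like the sketch below. This is not the MR's actual code: the worker body, the `FakeClient` double, and the argument names are illustrative stand-ins showing how a `project_only` flag can scope the delete to just the project document versus a delete-by-query over all of the project's documents.

```ruby
# Hedged sketch of a project_only option; names are stand-ins, not
# GitLab's actual implementation.
class DeleteProjectWorkerSketch
  def initialize(client)
    @client = client
  end

  # project_only: true removes just the project document and leaves
  # associated documents (issues, notes, blobs, ...) in the index.
  def perform(project_id, es_id, options = {})
    if options[:project_only]
      @client.delete(id: es_id)
    else
      @client.delete_by_query(query: { term: { project_id: project_id } })
    end
  end
end

# A recording fake stands in for the Elasticsearch client.
class FakeClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def delete(**args)
    @calls << [:delete, args]
  end

  def delete_by_query(**args)
    @calls << [:delete_by_query, args]
  end
end

client = FakeClient.new
DeleteProjectWorkerSketch.new(client).perform(1, "project_1", project_only: true)
p client.calls.map(&:first) # => [:delete]
```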
## MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
## Screenshots or screen recordings
N/A, the transfer is a backend process.
## How to set up and validate locally
Note: I had a hard time replicating the race condition because I could not get `pry-shell` to work in `Elastic::ProjectTransferWorker`. The steps below mostly worked to demonstrate the issue.
- Ensure advanced search is configured, enabled, and everything is indexed
- Create or find a top-level group with one subgroup
- Create a project inside the top-level group
- Verify the project's code data is findable
- Apply the diff to revert the code to the current behavior:
<details><summary>Click to expand diff</summary>

```diff
diff --git a/ee/app/workers/elastic/project_transfer_worker.rb b/ee/app/workers/elastic/project_transfer_worker.rb
index c5f80c76ddad..29522ec03f01 100644
--- a/ee/app/workers/elastic/project_transfer_worker.rb
+++ b/ee/app/workers/elastic/project_transfer_worker.rb
@@ -27,7 +27,9 @@ def perform(project_id, old_namespace_id, new_namespace_id)
         ::Elastic::ProcessInitialBookkeepingService.track!(build_document_reference(project))
         ::Elastic::ProcessInitialBookkeepingService.backfill_projects!(project, skip_projects: true)

-        delete_old_project(project, old_namespace_id, project_only: true)
+        binding.pry
+
+        delete_old_project(project, old_namespace_id, project_only: false)
       elsif should_invalidate_elasticsearch_indexes_cache && ::Gitlab::CurrentSettings.elasticsearch_indexing?
         # If the new namespace isn't indexed, the project's associated records should no longer exist in the index
         # and will be deleted asynchronously. Queue the project for indexing
```

</details>
- Run the project transfer worker manually in a rails console:

  ```ruby
  project = Project.find(PROJECT_ID)
  subgroup = Namespace.find(SUBGROUP_ID)
  ::Elastic::ProjectTransferWorker.new.perform(project.id, project.namespace_id, subgroup.id)
  ```
- When you hit the `binding.pry`, wait until all of the `::Elastic::ProcessInitialBookkeepingService` indexing is complete. You can kick it off manually with `::Elastic::ProcessInitialBookkeepingService.new.execute` and verify in `log/elasticsearch.log`
- Search and verify that you can find code for the project
- Continue past the `binding.pry`
- Search again: there will be no results (this is the race condition)
- Change the applied diff to only keep the `binding.pry` and re-run the same steps
- After the transfer worker is complete, the files should still be searchable
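To verify searchability from outside the rails console, the project-level search API exposes a `blobs` scope. The URL, project ID, and search term below are placeholders for a local GDK setup, and the actual request is left commented out since it needs a valid personal access token:

```ruby
# Build a project-level blob search URL (placeholders; adjust for your GDK).
require "uri"

gdk_url    = "http://gdk.test:3000"
project_id = 123
term       = "def"

url = URI("#{gdk_url}/api/v4/projects/#{project_id}/search?scope=blobs&search=#{term}")
puts url

# To actually issue the request, pass a personal access token, e.g.:
#   require "net/http"
#   Net::HTTP.get(url, { "PRIVATE-TOKEN" => token })
```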