Resolve "elasticsearch: wiki is not correctly indexed on import"

What does this MR do?

Issue #207491 (closed)

Findings

The two calls to ElasticCommitIndexerWorker (one for the project repository and one for the wiki repository) were initiated from the project creation hook maintain_elasticsearch_create in ProjectSearch. That works for projects created through the new project workflow, because the repositories are initially empty. During an import, however, the project is created before the repository and wiki repository imports have finished, so the indexing calls run too early. Additionally, we must avoid introducing unneeded wiki index calls when mirror repositories are updated, because wikis are not part of the repository updates.
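The timing problem described above can be reproduced in a minimal sketch. All classes here (FakeRepository, FakeIndexer, FakeProject) are hypothetical stand-ins, not GitLab's real classes, and the indexer runs inline instead of through Sidekiq; the only assumption taken from the findings is that the creation hook fires before the import has populated the wiki repository.

```ruby
# Hypothetical stand-in for a wiki repository that is filled by an import.
class FakeRepository
  attr_reader :commits

  def initialize
    @commits = []
  end

  def import!(new_commits)
    @commits.concat(new_commits)
  end
end

# Hypothetical indexer that records a snapshot of what it saw when it ran.
class FakeIndexer
  attr_reader :indexed

  def initialize
    @indexed = []
  end

  def perform(repository)
    @indexed << repository.commits.dup
  end
end

# Hypothetical project whose creation hook indexes immediately,
# mirroring the behavior attributed to maintain_elasticsearch_create.
class FakeProject
  attr_reader :wiki_repository

  def initialize(indexer)
    @wiki_repository = FakeRepository.new
    indexer.perform(@wiki_repository) # runs before any import
  end
end

indexer = FakeIndexer.new
project = FakeProject.new(indexer)

# The import finishes only after the project (and its hook) already ran.
project.wiki_repository.import!(%w[page-a page-b])

puts indexer.indexed.inspect
```

The indexer ran exactly once and saw an empty repository, so the two imported pages were never indexed.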

Proposed fix

Add a call to index the wiki repository after the project import has transitioned to the finished state, but do not run it for mirrored projects. Mirrored projects set to Pull create a scheduled ProjectImportService as part of each mirror update. The fix was verified with unit tests and manual testing in GDK.

The originally proposed fix was not the best place for repository indexing during imports. After some research and local testing, the wiki indexing was moved to the after_import method on the Project class. This method does not appear to be called when UpdateMirrorWorker runs, but it is called for all other imports.
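A rough sketch of that shape, not the actual diff: SketchProject, FakeIndexQueue, finish_import, update_mirror, and the perform_async(project_id, wiki:) argument shape are all hypothetical names introduced for illustration. Only after_import comes from the description above. The sketch assumes, as observed in local testing, that the mirror update path never reaches after_import.

```ruby
# Hypothetical job queue standing in for the real worker's enqueue call.
class FakeIndexQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  # Assumed argument shape; the real worker signature may differ.
  def perform_async(project_id, wiki:)
    @jobs << { project_id: project_id, wiki: wiki }
  end
end

# Hypothetical project model with only the pieces needed for the sketch.
class SketchProject
  attr_reader :id

  def initialize(id, queue)
    @id = id
    @queue = queue
  end

  # Runs once an import has reached the finished state,
  # so the wiki repository already has content to index.
  def after_import
    @queue.perform_async(id, wiki: true)
  end

  # Stand-in for a regular import (tarball, GitHub, ...) completing.
  def finish_import
    after_import
  end

  # Stand-in for a pull mirror update: repository refs change,
  # but after_import is not called, so no wiki job is enqueued.
  def update_mirror
    # repository update only
  end
end

queue = FakeIndexQueue.new
imported = SketchProject.new(1, queue)
mirrored = SketchProject.new(2, queue)

imported.finish_import
mirrored.update_mirror

puts queue.jobs.inspect
```

Only the imported project enqueues a wiki indexing job; the mirror update enqueues none, which matches the manual test results listed below.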

I ran some manual testing locally:

| Description | Expected | Actual |
| --- | --- | --- |
| GitLab import (tarball) | Wiki indexed | Wiki indexed via ElasticCommitIndexerWorker |
| GitHub import (access token) | Wiki indexed | Wiki indexed via ElasticCommitIndexerWorker |
| Mirror repository update | Wiki not indexed | Wiki not indexed |
| Fork repository | Wiki not indexed | Wiki not indexed |

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Added unit tests to validate that the new code works with both mirror and non-mirror projects.

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team