Change repository indexing to sorted sets algorithm
What does this MR do?
This MR changes the indexation strategy for repositories and wikis.
Prior to this change, each Project was indexed separately, using the gitlab-elasticsearch-indexer.
With this change, we now process Projects in batches, enabling us to leverage the Elasticsearch Bulk API to the fullest.
To achieve this, we split each project indexing operation into separate queues. Each group of queues is drained by a single cron worker.
ElasticIndexBulkCronWorker is responsible for:
- elastic:bulk:initial:0:zset
- elastic:incremental:updates:0:zset (to be renamed)
ElasticIndexBulkBlobCronWorker is responsible for:
- elastic:bulk:repository:initial:0:zet
- elastic:bulk:repository:updates:0:zet
- elastic:bulk:wiki:initial:0:zet
- elastic:bulk:wiki:updates:0:zet
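For reviewers unfamiliar with the pattern, here is a minimal sketch of how a sorted-set queue can be drained in batches into Bulk API payloads. It is an illustration only, not the code in this MR: `SortedSetQueue` is a hypothetical in-memory stand-in for a Redis sorted set, `drain` and `build_bulk_body` are hypothetical helpers, and the `gitlab-test` index name in the usage note is made up. The only external fact relied on is the Bulk API's newline-delimited format (an action line followed by a source line).

```python
import json
from typing import Callable, Dict, List, Tuple


class SortedSetQueue:
    """Hypothetical in-memory stand-in for a Redis sorted set used as a queue."""

    def __init__(self) -> None:
        self._scores: Dict[str, int] = {}  # member -> score
        self._next_score = 0

    def enqueue(self, *members: str) -> None:
        # Similar to ZADD: re-adding a member only bumps its score, so a
        # project queued twice is still stored (and processed) once.
        for member in members:
            self._next_score += 1
            self._scores[member] = self._next_score

    def peek_batch(self, limit: int) -> List[Tuple[str, int]]:
        # Similar to ZRANGEBYSCORE ... WITHSCORES LIMIT 0 <limit>:
        # returns up to `limit` members, lowest score first.
        ordered = sorted(self._scores.items(), key=lambda item: item[1])
        return ordered[:limit]

    def remove_up_to(self, max_score: int) -> None:
        # Similar to ZREMRANGEBYSCORE -inf <max_score>. A member re-queued
        # while its batch was in flight has a higher score and survives.
        self._scores = {m: s for m, s in self._scores.items() if s > max_score}

    def __len__(self) -> int:
        return len(self._scores)


def build_bulk_body(index: str, docs: Dict[str, dict]) -> str:
    """Build a newline-delimited Bulk API body: action line, then source line."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def drain(
    queue: SortedSetQueue,
    batch_size: int,
    process: Callable[[List[str]], None],
) -> int:
    """Drain the queue in batches; return the number of members processed."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    processed = 0
    while len(queue):
        batch = queue.peek_batch(batch_size)
        process([member for member, _ in batch])
        # Remove only after processing, using the highest score in the batch.
        queue.remove_up_to(batch[-1][1])
        processed += len(batch)
    return processed
```

A worker in this sketch would call `drain(queue, batch_size, process)` on each run, where `process` turns the batch into documents and sends something like `build_bulk_body("gitlab-test", docs)` as a single bulk request. Removing entries by score only after processing means a project re-queued mid-batch is picked up again on the next run instead of being lost.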
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
- Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- Label as security and @ mention @gitlab-com/gl-security/appsec
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team
Closes #205178