WIP: Make the incremental indexing bulk size configurable

Dylan Griffith requested to merge configurable-bulk-indexer-size into master Apr 01, 2020

What does this MR do?

The incremental indexer runs every minute and picks up to a hardcoded number of updates from a Redis queue to index in Elasticsearch. If updates arrive faster than this limit (currently 1000/minute), the indexer falls further and further behind with no way to catch up, and since the limit was hardcoded you'd need to redeploy the application to fix the problem.
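To make the arithmetic concrete, here is a minimal sketch of how the backlog evolves when one run per minute processes at most a fixed number of updates. The function name and the example rates other than 1000/minute are hypothetical and only illustrate the reasoning; this is not the indexer's actual code.

```python
def backlog_after(minutes, arrivals_per_minute, limit_per_run, initial_backlog=0):
    """Return the queue length after `minutes` runs, one run per minute.

    Each minute `arrivals_per_minute` updates are enqueued, then a single
    run removes at most `limit_per_run` of them.
    """
    backlog = initial_backlog
    for _ in range(minutes):
        backlog += arrivals_per_minute
        backlog -= min(backlog, limit_per_run)
    return backlog


# Hypothetical workload of 1200 updates/minute against the 1000/minute limit:
# the queue grows by 200 every minute and never recovers.
print(backlog_after(60, 1200, 1000))   # 12000

# With a larger (configurable) limit the same workload keeps up,
# and an existing backlog is drained within a few runs.
print(backlog_after(60, 1200, 5000))                        # 0
print(backlog_after(4, 1200, 5000, initial_backlog=12000))  # 0
```

The point is only that any limit below the arrival rate makes the backlog grow linearly, while any limit above it lets the indexer catch up.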

Having a hardcoded limit here does not make sense, as it is perfectly reasonable to have workloads that exceed 1000 updates per minute. In fact, it's likely GitLab will exceed this number soon, which will mean we'll perpetually fall behind even if our cluster can keep up.

Given that the incremental indexer already takes the max bulk size in MB into account, this new count limit can safely be set to quite a large number: the payloads will still be broken down further whenever they are too large.
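As a rough illustration of why a large count limit is safe, the sketch below picks up to a count limit from a queue and then splits what it picked into bulk requests that each stay under a byte budget. Everything here is an assumption made for illustration: the queue is a plain Python list of serialized payloads rather than Redis, and the function names are hypothetical, not the ones used in this MR.

```python
def split_into_bulk_requests(payloads, max_bulk_bytes):
    """Group payloads into batches whose total size stays within max_bulk_bytes.

    A single payload larger than the budget still gets a batch of its own.
    """
    batches = []
    current, current_bytes = [], 0
    for payload in payloads:
        size = len(payload.encode("utf-8"))
        # Close the current batch if adding this payload would exceed the budget.
        if current and current_bytes + size > max_bulk_bytes:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(payload)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


def process_one_run(queue, count_limit, max_bulk_bytes):
    """Take at most count_limit payloads off the queue and batch them by size."""
    picked = queue[:count_limit]
    del queue[:count_limit]
    return split_into_bulk_requests(picked, max_bulk_bytes)


# Ten 400-byte payloads, count limit of 6, byte budget of 1000 per bulk request.
queue = ["x" * 400 for _ in range(10)]
batches = process_one_run(queue, count_limit=6, max_bulk_bytes=1000)
print([len(batch) for batch in batches])  # [2, 2, 2]
print(len(queue))                         # 4
```

Raising the count limit in this sketch only changes how many payloads are picked per run; the size of each individual bulk request is still capped by the byte budget.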

Ideally this ProcessBookkeepingService would not need a count limit at all. It would pick all jobs from the queue each time it runs, never exceeding the MB limit in memory, and just keep picking more from Redis as necessary. That change seemed much trickier than making this hardcoded number configurable, and since this will be important for us to configure quite soon, I went ahead and did the simplest thing first.
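For reference, the "ideal" approach described above might look roughly like the following sketch. It is not what this MR implements. It assumes an in-memory `deque` standing in for the Redis queue, a hypothetical `fetch_size` for how many items are read per round trip, and a hypothetical `flush` callback standing in for sending one bulk request.

```python
from collections import deque


def drain_queue(queue, max_bulk_bytes, fetch_size, flush):
    """Drain the whole queue, flushing whenever the byte budget would be exceeded.

    Only one fetched chunk plus the current buffer are held at a time, so the
    buffered payload size is capped by max_bulk_bytes, not by the queue length.
    Returns the number of flushes performed.
    """
    buffer, buffered_bytes = [], 0
    flushes = 0
    while queue:
        # Fetch the next small chunk (stand-in for reading more from Redis).
        chunk = [queue.popleft() for _ in range(min(fetch_size, len(queue)))]
        for payload in chunk:
            size = len(payload.encode("utf-8"))
            if buffer and buffered_bytes + size > max_bulk_bytes:
                flush(buffer)
                flushes += 1
                buffer, buffered_bytes = [], 0
            buffer.append(payload)
            buffered_bytes += size
    if buffer:
        flush(buffer)
        flushes += 1
    return flushes


# Seven 400-byte payloads with a 1000-byte budget, read three at a time.
sent = []
pending = deque("x" * 400 for _ in range(7))
print(drain_queue(pending, max_bulk_bytes=1000, fetch_size=3, flush=sent.append))  # 4
print([len(batch) for batch in sent])  # [2, 2, 2, 1]
print(len(pending))                    # 0
```

A real implementation would also have to handle failed bulk requests and updates enqueued while the run is in progress, which is part of why the configurable limit was the simpler first step.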

Does this MR meet the acceptance criteria?

Conformity

Changelog entry
Documentation (if required)
Code review guidelines
Merge request performance guidelines
Style guides
Database guides
Separation of EE specific content

Availability and Testing

Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
Tested in all supported browsers
Informed Infrastructure department of a default or new setting change, if applicable per definition of done

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

Label as security and @ mention @gitlab-com/gl-security/appsec
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
Security reports checked/validated by a reviewer from the AppSec team

Edited May 31, 2022 by 🤖 GitLab Bot 🤖
