WIP: Make the incremental indexing bulk size configurable
What does this MR do?
The incremental indexer runs every minute and picks up to a hardcoded number of updates from a Redis queue to index in Elasticsearch. If updates arrive faster than this rate (currently 1000/minute), the indexer falls further and further behind with no way to catch up. Because the limit is hardcoded, fixing the problem requires redeploying the application.
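To make the arithmetic concrete, here is a minimal sketch of how the backlog evolves under a fixed per-run limit. The method name `backlog_after` and the arrival rates are illustrative assumptions, not code from this MR.

```ruby
# Hypothetical model: names and numbers are illustrative, not GitLab code.
LIMIT_PER_RUN = 1_000 # updates the indexer picks from the queue each minute

# Returns the queue depth after `minutes` runs, given a steady arrival rate.
def backlog_after(minutes, arrivals_per_minute, limit: LIMIT_PER_RUN)
  backlog = 0
  minutes.times do
    backlog += arrivals_per_minute  # new updates enqueued during this minute
    backlog -= [backlog, limit].min # the indexer drains at most `limit` updates
  end
  backlog
end

puts backlog_after(60, 900)   # => 0, arrivals stay under the limit
puts backlog_after(60, 1_200) # => 12000, the backlog grows by 200 every minute
```

Under this simplified model, any sustained arrival rate above the limit grows the queue linearly, regardless of how much spare capacity the Elasticsearch cluster has.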
A hardcoded limit does not make sense here, as it is perfectly reasonable for workloads to exceed 1000 updates per minute. In fact, GitLab will likely exceed this number soon, which means we would perpetually fall behind even if our cluster could keep up.
Because the incremental indexer already takes the max bulk size in MB into account, this new count limit can safely be set to quite a large number. Payloads are still broken down further whenever they are too large.
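The interaction between the two limits can be sketched as follows. This is not the actual indexer code: `split_into_bulk_requests` and its `max_bytes` parameter are hypothetical names, and the sketch only assumes that documents are serialized to JSON before being sent.

```ruby
require 'json'

# Hypothetical helper, not the actual GitLab implementation.
# Splits documents into bulk requests so that no request exceeds max_bytes.
# A single document larger than max_bytes still gets a request of its own.
def split_into_bulk_requests(documents, max_bytes:)
  batches = [[]]
  current_bytes = 0

  documents.each do |doc|
    size = JSON.generate(doc).bytesize
    # Start a new batch if adding this document would exceed the byte limit.
    if current_bytes + size > max_bytes && !batches.last.empty?
      batches << []
      current_bytes = 0
    end
    batches.last << doc
    current_bytes += size
  end

  batches.reject(&:empty?)
end

# Ten small documents picked under a generous count limit.
docs = Array.new(10) { |i| { id: i, body: 'x' * 100 } }
batches = split_into_bulk_requests(docs, max_bytes: 300)
puts batches.map(&:size).inspect
```

However many documents the count limit lets through, each request stays bounded by the byte limit, which is why raising the count limit should be safe.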
Ideally, ProcessBookkeepingService would not need a count limit at all. It would process every job in the queue each time it runs, never holding more than the MB limit in memory and picking more from Redis as necessary. That change seemed much trickier than making the hardcoded number configurable, and since we will need to configure this quite soon, I went ahead and did the simplest thing first.
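For comparison, the ideal behaviour described above might look roughly like the sketch below. It is only an illustration of the idea: `drain_queue` is a hypothetical name, a plain Ruby array stands in for the Redis queue, and the block stands in for the bulk request to Elasticsearch.

```ruby
# Hypothetical sketch, not part of this MR.
# `queue` stands in for the Redis queue; the block stands in for the bulk indexer.
# Keeps pulling items until the queue is empty, flushing whenever the in-memory
# buffer would exceed max_bytes. Returns the number of flushes performed.
def drain_queue(queue, max_bytes:)
  buffer = []
  buffer_bytes = 0
  flushes = 0

  until queue.empty?
    item = queue.shift
    # Flush before the buffer grows past the byte limit.
    if buffer_bytes + item.bytesize > max_bytes && !buffer.empty?
      yield buffer
      flushes += 1
      buffer = []
      buffer_bytes = 0
    end
    buffer << item
    buffer_bytes += item.bytesize
  end

  # Flush whatever is left once the queue is empty.
  unless buffer.empty?
    yield buffer
    flushes += 1
  end

  flushes
end

queue = Array.new(7) { 'x' * 40 }
sent = []
drain_queue(queue, max_bytes: 100) { |batch| sent << batch.size }
puts sent.inspect
```

A real version would also need to handle failures, retries, and jobs enqueued while the loop is running, which is part of why the configurable limit is the simpler first step.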
Screenshots
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
- Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- Label as security and @ mention @gitlab-com/gl-security/appsec
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team