Add limit to concurrent batch exports
What does this MR do and why?
Adds a limit to concurrent batch exports.
Allows administrators to configure the maximum number of simultaneous relation batch exports (default: 25) to prevent resource exhaustion.
The limit isn't strict: there is a delay between the check of how many exports are running and the moment a job's status actually changes, so this race condition can enqueue a few more jobs than the configured limit allows. Any fool-proof way of preventing this would add unnecessary complexity and delays in the form of locks, and I don't believe that's needed, since there's no harm in a couple more jobs being enqueued than expected.
Unlike `bulk_import_concurrent_pipeline_batch_limit`, this is a global setting; the other one appears to be scoped per export. I'd like to be consistent, but since this is a performance feature, I believe a global limit makes more sense. With `bulk_import_concurrent_pipeline_batch_limit`, many projects or groups can be imported at the same time and push us over the limit. I can easily change this by scoping the count, so we can alter it if needed.
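To make the mechanism concrete, here is a minimal sketch of how a worker could enforce a global limit by deferring itself when too many exports are in progress. This is an illustration only, not the MR's implementation: the `BulkImports::ExportBatch.started` scope, the worker's argument list, and the re-enqueue delay are assumptions.

```ruby
class RelationBatchExportWorker
  include Sidekiq::Worker # in GitLab this would be ApplicationWorker

  RESCHEDULE_DELAY = 1.minute

  def perform(user_id, batch_id)
    # Global count of batches currently being exported (model/scope names are assumptions).
    in_progress = BulkImports::ExportBatch.started.count

    if in_progress >= Gitlab::CurrentSettings.concurrent_relation_batch_export_limit
      # Over the limit: push the job back onto the scheduled set instead of running now,
      # which matches the "scheduled for 1 minute from now" behaviour described below.
      self.class.perform_in(RESCHEDULE_DELAY, user_id, batch_id)
      return
    end

    # ... perform the batch export as before ...
  end
end
```

Because the count is taken over all in-progress batches rather than per export, the same job can be deferred regardless of which group or project triggered it, which is the global behaviour described above.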
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
| Before | After |
|---|---|
Postgres.ai results
- https://console.postgres.ai/gitlab/gitlab-production-main/sessions/33053/commands/101894
- https://console.postgres.ai/gitlab/gitlab-production-main/sessions/33053/commands/101895
How to set up and validate locally
- Create a group and add a lot of milestones (assuming `mygroup` is a `Group` record loaded in a Rails console):

  ```ruby
  ms = 10000.times.map { {"title" => "Milestone #{SecureRandom.uuid}", "created_at" => Time.now} }; nil
  mygroup.milestones.insert_all(ms)
  ```

- Update the new setting to a limit of 1 via the API:

  ```shell
  curl --request PUT \
    --url http://172.16.123.1:3000/api/v4/application/settings \
    --header 'Content-Type: application/json' \
    --header 'PRIVATE-TOKEN: <PRIVATE_TOKEN>' \
    --data '{ "concurrent_relation_batch_export_limit": 1 }'
  ```

- Export the relations for the group with many milestones:

  ```shell
  curl --request POST \
    --url 'http://172.16.123.1:3000/api/v4/groups/<GROUP_ID>/export_relations?batched=true' \
    --header 'PRIVATE-TOKEN: <PRIVATE_TOKEN>'
  ```

- Observe in the Sidekiq scheduled tab (http://gdk.test:3000/admin/sidekiq/scheduled) that some `RelationBatchExportWorker` jobs have been scheduled for 1 minute from now. (You may have to refresh a few times to see this; alternatively, use the console check after this list.)
- Try again with the limit set to a higher number (e.g. 100) and you shouldn't see this.
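If you prefer not to refresh the admin UI, here is a small optional console check of the Sidekiq scheduled set. It matches on the worker name mentioned above; the exact namespaced class name is an assumption, so the check uses a substring match.

```ruby
# Run in a GDK Rails console. Counts scheduled jobs whose class name
# contains 'RelationBatchExportWorker'.
require 'sidekiq/api'

scheduled = Sidekiq::ScheduledSet.new.select { |job| job.klass.include?('RelationBatchExportWorker') }
puts "#{scheduled.size} batch export jobs waiting to be rescheduled"
```

With the limit set to 1 the count should stay above zero while the export runs; with a higher limit (e.g. 100) it should stay at zero.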
Related to #473224 (closed)