Add support for taskscaler scale throttle
What does this MR do?
- Throttles instance scaling (default 100 instances per second, burst limit matches `max_instances`).
- Provides backoff should an instance fail to initialise with the `instance_ready_command` config option.
Why was this MR needed?
- There was no way to throttle instance creation unless configured at the instance group level, which is not always supported.
- More importantly, if `instance_ready_command` fails often, there was no backoff/throttle on scaling up, and idle instances could be repeatedly created in quick succession.
What's the best way to test this MR?
- Taskscaler has its own tests around the rate limiter.
- We eventually want to update the test here: https://gitlab.com/gitlab-org/gitlab-runner/-/blob/77c78acd31731a0b91474d453708ce0bb86a00c5/executors/instance/instance_integration_test.go#L93-96, but this is proving difficult until !4719 is merged and we can perhaps start capturing logrus logs to determine that we're backing off, as there's no other way to check.
Manual check:
- Create a runner with the `instance` executor
- Set idle count to 2
- Use `instance_ready_command = "exit 1"`
Observe each instance fail to start up and backoff occurring.
What are the relevant issue numbers?
Edited by Arran Walker