Add support for taskscaler scale throttle (!4722) · GitLab.org / gitlab-runner

Arran Walker requested to merge ajwalker/taskscaler-rate-limit into main Apr 17, 2024

⚠ gitlab-org/fleeting/taskscaler!45 (merged) needs to be reviewed/merged first ⚠

What does this MR do?

Throttles instance scaling (default 100 instances per second, burst limit matches max_instances).
Backs off when an instance fails to initialise, as reported by the instance_ready_command config option.

Why was this MR needed?

There was no way to throttle instance creation unless configured at the instance group level, which is not always supported.
More importantly, if instance_ready_command failed often, there was no backoff or throttle on scaling up, so idle instances could be created repeatedly in quick succession.

What's the best way to test this MR?

Taskscaler has its own tests around the rate limiter.
We eventually want to update the test here: https://gitlab.com/gitlab-org/gitlab-runner/-/blob/77c78acd31731a0b91474d453708ce0bb86a00c5/executors/instance/instance_integration_test.go#L93-96. This is proving difficult until !4719 (merged) is merged; after that we can perhaps start capturing logrus logs to determine that we're backing off, as there's no other way to check.

Manual check:

Create a runner that uses the instance executor
Set the idle count to 2
Set instance_ready_command = "exit 1"

Observe each instance failing to start up and the backoff taking effect.
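The manual check above might correspond to a config.toml fragment along these lines. This is only a sketch: the runner name, URL, token, plugin, and every value other than idle_count and instance_ready_command are placeholders, not taken from this MR, and the plugin-specific settings are omitted.

```toml
[[runners]]
  name = "instance-backoff-check"  # placeholder name
  url = "https://gitlab.com"       # placeholder URL
  token = "REDACTED"               # placeholder token
  executor = "instance"

  [runners.autoscaler]
    plugin = "aws"                 # placeholder; use whichever fleeting plugin you have
    capacity_per_instance = 1
    max_instances = 5              # also the throttle's burst limit, per this MR
    instance_ready_command = "exit 1"  # always fails, so every instance is rejected

    # [runners.autoscaler.plugin_config] is plugin-specific and omitted here.

    [[runners.autoscaler.policy]]
      idle_count = 2
      idle_time = "20m0s"
```

Because the ready command always exits non-zero, no instance ever becomes ready, so the runner keeps trying to reach the idle count of 2 and the delay between attempts should become visible in its logs.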

What are the relevant issue numbers?

#37473 (closed)

#37497 (closed)

Edited Apr 22, 2024 by Arran Walker