Create VMs in background to speed-up the autoscaling
For the initial implementation of `autoscaler` we decided to go with the simplest approach: create the VM in the context of the `prepare` command call. While this allowed us to move forward and prepare a working test environment for ~"Shared Runners::Windows" (which in a few days will bring us an open beta test program on GitLab.com), it's not ideal from the user's perspective.
With the current implementation, each job takes longer by the time required to spin up the VM that will handle it. In our existing Shared Runners (powered by the `docker+machine` executor) this time is mostly invisible to the user, because Runner creates VMs in advance and assigns the first free one from the pool to a newly started job. The user experience difference is that with our current Windows Shared Runners configuration, we should expect job queue timings similar to what we see for the Linux Shared Runners only under heavy load on GitLab.com's CI.
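The pooling behavior described above can be sketched as follows. This is a minimal illustration of the idea, not the actual `docker+machine` code; the `vmPool` type, the `acquire` method, and the VM names are invented for the example:

```go
package main

import "fmt"

// vm is a stand-in for a provisioned virtual machine.
type vm struct{ name string }

// vmPool holds VMs created in advance, mirroring the idea that
// docker+machine keeps a set of idle machines ready for new jobs.
type vmPool struct {
	idle []*vm
	seq  int
}

// acquire returns a pre-created idle VM immediately when one exists;
// otherwise the caller pays the full spin-up cost, which is exactly
// the delay the current Windows implementation adds to every job.
func (p *vmPool) acquire() (*vm, bool) {
	if len(p.idle) > 0 {
		v := p.idle[0]
		p.idle = p.idle[1:]
		return v, true // fast path: no spin-up wait
	}
	p.seq++
	// Slow path: stands in for a blocking VM creation.
	return &vm{name: fmt.Sprintf("vm-%d", p.seq)}, false
}

func main() {
	p := &vmPool{idle: []*vm{{name: "idle-1"}}}

	v, fromPool := p.acquire()
	fmt.Println(v.name, fromPool) // idle-1 true

	v, fromPool = p.acquire()
	fmt.Println(v.name, fromPool) // vm-1 false
}
```

The second `acquire` call models today's Windows behavior: no pool exists, so every job hits the slow path.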
When discussing this in the past, the idea we had for improving it was the following:

- Implement a daemon mode in `autoscaler`. With this we would start `autoscaler` as a separate, long-living process, in exactly the same way GitLab Runner is started.
- We should add configuration options similar to what we have in the `docker+machine` executor, so: number of `idle` VMs, `maximum` number of VMs, and `maximum number of jobs` that a VM can handle before being removed. For the first iteration I think we can skip supporting `OffPeak` versions. Together, these three options would be responsible for creating VMs in the background. Keeping the `idle` number of machines up and ready, managing the lifetime of the VMs, and assigning free VMs when requested would be the task of the daemon mode.
- When autoscaling is enabled, we should change the behavior of the `prepare`, `run` and `cleanup` commands of `autoscaler`. Instead of creating/removing the VM directly, they should:
  - Check if `autoscaler`'s daemon is available.
  - Connect to it and request a VM lock (for `prepare`). If there is a ready VM, `autoscaler`'s daemon should choose one, lock it for the usage of the given job, and return immediately. If not, it should schedule a creation, which of course becomes a blocking operation (so it behaves as `autoscaler` works now).
  - Connect to it and get the VM connection details (for `run`). Execute the job on the VM as it's done now.
  - Connect to it and request a VM release (for `cleanup`). Depending on the autoscaling configuration, this would either release the VM back to the pool of free VMs or trigger a VM removal. Either way, `cleanup` returns immediately, and the removal/release happens in the background.
- To communicate between `autoscaler`'s commands and `autoscaler`'s daemon we should use something like gRPC.
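The prepare/run/cleanup flow against the daemon could look roughly like this. This is an in-process sketch only: in the proposed design these would be gRPC calls to the long-living process, and the `Daemon` type, its method names, and the VM identifiers here are all invented for illustration:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Daemon is a stand-in for the long-living autoscaler process.
// The three commands would reach it over gRPC in the real design;
// plain method calls keep this sketch self-contained.
type Daemon struct {
	mu     sync.Mutex
	idle   []string          // ready VMs waiting for jobs
	locked map[string]string // jobID -> VM currently assigned
}

func NewDaemon(idle ...string) *Daemon {
	return &Daemon{idle: idle, locked: map[string]string{}}
}

// AcquireVM backs the `prepare` command: return a ready VM immediately,
// or fall back to a (blocking) creation, as autoscaler behaves today.
func (d *Daemon) AcquireVM(jobID string) string {
	d.mu.Lock()
	defer d.mu.Unlock()
	var vm string
	if len(d.idle) > 0 {
		vm = d.idle[0]
		d.idle = d.idle[1:]
	} else {
		vm = "created-for-" + jobID // stands in for the blocking VM creation
	}
	d.locked[jobID] = vm
	return vm
}

// ConnectionDetails backs the `run` command.
func (d *Daemon) ConnectionDetails(jobID string) (string, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	vm, ok := d.locked[jobID]
	if !ok {
		return "", errors.New("no VM locked for job " + jobID)
	}
	return vm, nil
}

// ReleaseVM backs the `cleanup` command: it only schedules the
// release/removal and returns immediately; here the "background" work
// is reduced to putting the VM back into the idle pool when reused.
func (d *Daemon) ReleaseVM(jobID string, reuse bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	vm := d.locked[jobID]
	delete(d.locked, jobID)
	if reuse {
		d.idle = append(d.idle, vm)
	}
}

func main() {
	d := NewDaemon("idle-1")

	vm := d.AcquireVM("job-42")                 // prepare
	details, _ := d.ConnectionDetails("job-42") // run
	fmt.Println(vm, details)

	d.ReleaseVM("job-42", true) // cleanup: returns immediately
	fmt.Println(len(d.idle))
}
```

Note that `cleanup` never waits on the removal/release itself, which is the property the proposal is after.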
With the above we would decouple VM management from job execution as much as possible. In fact, in a slightly different way, it would replicate exactly how we work now with our Linux Shared Runners.
Queue theory
Video resources:
- https://www.youtube.com/watch?v=66MPuv9wiIU
- https://www.youtube.com/watch?v=AsTuNP0N7DU
- https://www.therepl.net/episodes/29/
Books/Written material: