Reduce Runner loop busy work

The main runner loop at the moment does this:

  • Continuously feed runner configs into the runners channel.
  • Workers (based on the concurrency value) read from the runners channel.
  • A worker does the following:
    • provider.Acquire() to acquire "executor data".
    • buildsHelper.acquireBuild() to ensure that we don't exceed the runner.Limit for this Runner.
    • processBuildOnRunner():
      • createSession() sets up a session server.
      • requestJob():
        • buildsHelper.acquireRequest() to ensure that we don't exceed the in-flight request limit for this Runner (request_concurrency).
        • doJobRequest() sends the HTTP request for a job.

Putting it another way, for each potential job we process we do the following:

  • Check to see if the executor is ready (costly)
  • Check to see if we're at the runner limit (cheap)
  • Create a session server (costly)
  • Check to see if we're at the request limit (cheap)

Now, in most configurations, request_concurrency is 1. The request to GitLab.com is long-polling, which means it waits for a job for 60 seconds.

For any config with concurrency > 1, this typically means that during the 60 seconds of a long-polling request we are, in the background, constantly pinging the executor to see if it has capacity for a job, creating a session for that job, and then throwing all of that away because we're already at the request limit.

This could be solved by re-ordering the operations:

  • Check to see if we're at the runner limit (cheap)
  • Check to see if we're at the request limit (cheap)
  • Check to see if the executor is ready (costly)
  • Create a session server (costly)
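
To illustrate the difference, here is a minimal, self-contained Go sketch that counts how often each costly step runs under both orderings. It is not the Runner's actual code: the `runner`, `counters`, `step`, and `simulate` names are hypothetical, the executor is assumed to always be ready, and the first request is assumed to stay in flight for the whole simulation (as a long-polling request would with request_concurrency = 1).

```go
package main

import "fmt"

// counters records how often each step ran. Hypothetical, for illustration.
type counters struct {
	executorChecks int // stands in for provider.Acquire() (costly)
	sessions       int // stands in for createSession() (costly)
	requests       int // stands in for doJobRequest()
}

// runner is a simplified stand-in for one Runner's limits and current usage.
type runner struct {
	limit, builds                int // runner.Limit and builds currently held
	requestConcurrency, inflight int // request_concurrency and requests in flight
}

// acquireBuild takes a build slot unless the runner limit is reached (cheap).
func (r *runner) acquireBuild() bool {
	if r.builds >= r.limit {
		return false
	}
	r.builds++
	return true
}

func (r *runner) releaseBuild() { r.builds-- }

// acquireRequest takes a request slot unless the request limit is reached (cheap).
func (r *runner) acquireRequest() bool {
	if r.inflight >= r.requestConcurrency {
		return false
	}
	r.inflight++
	return true
}

// step is one worker iteration; it reports whether a job request was sent.
type step func(r *runner, c *counters) bool

// currentOrder mirrors the order described above: executor, runner limit,
// session, request limit.
func currentOrder(r *runner, c *counters) bool {
	c.executorChecks++ // costly, done before any cheap check
	if !r.acquireBuild() {
		return false
	}
	c.sessions++ // costly, done before the request limit check
	if !r.acquireRequest() {
		r.releaseBuild() // the executor check and session were wasted
		return false
	}
	c.requests++
	return true
}

// reordered runs both cheap checks first and only then the costly steps.
func reordered(r *runner, c *counters) bool {
	if !r.acquireBuild() {
		return false
	}
	if !r.acquireRequest() {
		r.releaseBuild() // nothing costly has happened yet
		return false
	}
	c.executorChecks++
	c.sessions++
	c.requests++
	return true
}

// simulate runs the given step repeatedly against a fresh runner. The request
// slot is never released, modelling a long-polling request that is still pending.
func simulate(s step, iterations int) counters {
	r := &runner{limit: 10, requestConcurrency: 1}
	var c counters
	for i := 0; i < iterations; i++ {
		s(r, &c)
	}
	return c
}

func main() {
	fmt.Printf("current:   %+v\n", simulate(currentOrder, 100))
	fmt.Printf("reordered: %+v\n", simulate(reordered, 100))
}
```

Under these assumptions, 100 iterations of the current order perform 100 executor checks and create 100 sessions for a single job request, while the reordered version performs one of each.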