Skip to content

Add command for base certificates regeneration

NOTE THAT THIS FORK IS MAINTAINED FOR CRITICAL BUG FIXES AFFECTING RUNNING COSTS ONLY. NO OTHER CONTRIBUTIONS WILL BE ACCEPTED.

Related to https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/13795

What critical bug this MR is fixing?

How does this change help reduce cost of usage? What scale of cost reduction is it?

In what scenarios is this change usable with GitLab Runner's docker+machine executor?

It addresses the known problem related to docker+machine executor.

In short:

Docker Machine uses TLS Client certificates authentication to authenticate Docker Client requests against the Docker Engine API running on the created machine. Initially, the certificates don't exist. They are created upon the first docker-machine create call. Also in case when the certificate (CA and/or the client certificate) got outdated, they are regenerated upon the docker-machine create call.

This however creates a problem in a highly concurrent setups, which is mostly the case of docker+machine executor. As then multiple concurrently running docker-machine create commands are trying to create/recreate the keys and certificates. Which creates a race conditions and in most times causes the Runner to enter a machines creation failing loop (as the certificates can't be created or concurrent updates created certificate mismatches).

Unfortunately, there is no way to generate new or regenerate outdated certificates out of the scope of docker-machine create. There is a docker-machine regenerate-certs command, but it requires an existing machine to be used for that which is a really strange design choice.

This MR addresses that problem. It adds docker-machine regenerate-base-certs command, which uses Docker Machine internals to generate/regenerate required certificates but only the base ones (so the CA and the client certificate). It also doesn't require any existing machine to be used for that.

With this change, any orchestrating around GitLab Runner (for example - our Chef cookbook that we use to manage GitLab.com runner managers fleet) can be updated to run docker-machine regenerate-base-certs frequently. Which will have two effects:

  • on new runner manager hosts - it will generate the initial keys and certificates, after Docker Machine installation,
  • on existing runner manager hosts - it will regenerate the certificates that are already outdated.

The --regenerate-before flag allows to define an earlier regeneration than what Not After, which allows the externally managed regeneration (for example the chef-based one) to be done before the certificates are really outdated. This will prevent falling into the same race-condition with the docker-machine create calls.

Edited by Tomasz Maczukin

Merge request reports