Consider appending TLS certs to the system store in runner helper

Background

GitLab.com recently rotated its TLS certificate from one signed by DigiCert to one signed by Let's Encrypt. As described in gitlab-com/gl-infra/production#17265 (closed), this caused a number of issues. Many users had to restart their runners before jobs would pick up the new certs, because of the following sequence:

  1. GitLab Runner establishes a TLS keep-alive connection with GitLab.com via Cloudflare. Before the cert change, this was using the DigiCert root.
  2. When no jobs are available, Workhorse holds the request in a long poll.
  3. When the long poll timeout expires (50 s, set by apiCiLongPollingDuration), the Runner retries with another request, reusing the same TLS connection.
  4. This repeats until a job is available for the runner.
  5. When a job becomes available, the Runner extracts the TLS certs from this keep-alive connection and uses them to build the file referenced by CI_SERVER_TLS_CA_FILE. Because this connection was established in step 1, these certs are the old ones.
  6. The Runner helper attempts to run git clone with the certs from step 5, but these don't match the new Let's Encrypt cert.
  7. The job fails, and the Runner goes back to step 2 with the existing TLS connection.

When a Runner connects to GitLab.com, the TLS connection is between the Runner and Cloudflare. Changing the certs does not close existing TLS connections, so a connection opened before the rotation keeps reporting the old chain.

Restarting the Runner helps because it re-establishes that TLS keep-alive connection, and the new handshake receives the new cert.
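
The sequence above can be illustrated with a toy model. This is only a sketch in Python, not Runner code: the Server, KeepAliveConnection and Runner classes and the chain names are hypothetical stand-ins, and a "chain" is reduced to a plain string. The point it tries to show is that the certs seen on a connection are fixed at handshake time, while git clone sees whatever the server presents now.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Stand-in for the TLS endpoint (Cloudflare in the description above)."""
    current_chain: str = "digicert-root"

    def rotate(self, new_chain: str) -> None:
        # Rotation only affects handshakes that happen afterwards.
        self.current_chain = new_chain


@dataclass
class KeepAliveConnection:
    """A connection remembers the chain it saw during its own handshake."""
    peer_chain: str

    @classmethod
    def handshake(cls, server: Server) -> "KeepAliveConnection":
        return cls(peer_chain=server.current_chain)


class Runner:
    def __init__(self, server: Server) -> None:
        self.server = server
        self.conn = KeepAliveConnection.handshake(server)  # step 1

    def restart(self) -> None:
        # A restart drops the old connection and performs a new handshake.
        self.conn = KeepAliveConnection.handshake(self.server)

    def run_job(self) -> bool:
        # Step 5: the CA file is built from the reused connection's chain.
        ca_file = self.conn.peer_chain
        # Step 6: git clone opens its own connection and sees the current chain.
        return ca_file == self.server.current_chain


server = Server()
runner = Runner(server)
server.rotate("letsencrypt-root")
print(runner.run_job())  # False: the CA file still holds the old chain
runner.restart()
print(runner.run_job())  # True: the new handshake picked up the new chain
```

In this model the job keeps failing for as long as the original connection is reused, which matches the loop between steps 2 and 7.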

Proposal

In https://gitlab.com/gitlab-org/gitlab-runner/-/blob/af15566e2d9161f774f96d8a4c98d039627cb86e/shells/abstract.go#L337-348, the Runner helper always sets git config with the TLS certs derived from the connection.

This is needed when self-signed certs are used. However, in the majority of cases (especially with public sites like GitLab.com), the Runner helper already has the certificates needed to validate the host. We could consider avoiding these special SSL settings entirely.

If we do need to add self-signed certs or other untrusted certs not present in the system store, perhaps we should consider appending the certs to the trusted system certs. This would make TLS verification less brittle. For example, with Alpine:

  1. Copy the certs to /usr/local/share/ca-certificates.
  2. Run update-ca-certificates.
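
As a rough sketch of those two steps (Python is used only for illustration; install_ca, its parameters and the LOCAL_CA_DIR constant are hypothetical names, not part of the Runner), assuming the helper image ships update-ca-certificates and the process can write to the directory:

```python
import subprocess
from pathlib import Path
from typing import Sequence

# Directory named in step 1 above.
LOCAL_CA_DIR = Path("/usr/local/share/ca-certificates")


def install_ca(cert_pem: str, name: str,
               ca_dir: Path = LOCAL_CA_DIR,
               update_cmd: Sequence[str] = ("update-ca-certificates",)) -> Path:
    """Hypothetical helper: drop a PEM cert into ca_dir, then rebuild the store."""
    # Step 1: write the cert under a conventional .crt file name.
    target = ca_dir / f"{name}.crt"
    target.write_text(cert_pem)
    # Step 2: regenerate the system bundle; raises CalledProcessError on failure.
    subprocess.run(list(update_cmd), check=True)
    return target
```

Calling install_ca(pem, "corp-root") inside the helper container would need write access to that directory, which usually means running as root. The ca_dir and update_cmd parameters exist only so the sketch can be tried outside a real image.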

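Alternatively, the Runner could append the self-signed certs to the contents of /etc/ssl/certs/ca-certificates.crt (either in that file or in a new copy) and point git's http.sslCAInfo setting at the result. Note that http.sslCAInfo is the option that names the CA bundle used to verify the server; http.sslCert configures a client certificate.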
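
A minimal sketch of the "new file" variant follows, again in Python purely for illustration. build_combined_bundle and git_ca_config_args are hypothetical helpers, and the use of a global git config is an assumption made to keep the example short. git's http.sslCAInfo option is the setting that names a CA bundle file for verifying the server.

```python
from pathlib import Path
from typing import List


def build_combined_bundle(system_bundle: Path, extra_pem: str, out_path: Path) -> Path:
    """Write the system CAs followed by the extra PEM certs to out_path.

    The system bundle itself is left untouched.
    """
    base = system_bundle.read_text()
    # Make sure PEM blocks from the two sources do not run together.
    if base and not base.endswith("\n"):
        base += "\n"
    extra = extra_pem if extra_pem.endswith("\n") else extra_pem + "\n"
    out_path.write_text(base + extra)
    return out_path


def git_ca_config_args(bundle: Path) -> List[str]:
    """Command line that points git at the combined bundle (assumed global scope)."""
    return ["git", "config", "--global", "http.sslCAInfo", str(bundle)]
```

With this approach a job would still verify against the public roots in the system store, so a rotation like the one described in the background would not depend on certs captured from an older connection.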