Smudge error: Error downloading...x509: certificate signed by unknown authority with LFS remote object store
Summary
When a remote object store is used for LFS, the runner gets Smudge error: Error downloading...x509: certificate signed by unknown authority when trying to download objects.
When Runner makes its first request (the request for a new job), it:
- gets the TLS certificate and chain from GitLab's response,
- tries to generate a full certificate chain based on the response data,
- attaches the resolved chain to the job data,
- attaches the specified client key/certificate pair (if configured) to the job data.
The chain from the job data is then exported as the GIT_SSL_CAINFO variable during job execution. This becomes the only source of truth for git's certificate validation. If a client key/certificate pair was configured, its content is also exported as the GIT_SSL_KEY and GIT_SSL_CERT variables, which allows git to handle TLS client authentication that may be configured in front of GitLab.
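The mapping from job data to git variables described above can be sketched as follows. This is only an illustrative model, not Runner's actual code: the JobTLSData class, its field names, and the git_tls_variables function are hypothetical, and the sketch assumes the client pair is exported only when both parts are present.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class JobTLSData:
    """Hypothetical container for the TLS data attached to a job."""
    ca_chain: str                      # PEM chain resolved from GitLab's response
    client_cert: Optional[str] = None  # PEM client certificate, if configured
    client_key: Optional[str] = None   # PEM client key, if configured


def git_tls_variables(tls: JobTLSData) -> Dict[str, str]:
    """Return the git TLS variables exported for a job (illustrative only)."""
    # The resolved chain is always exported and becomes git's only CA source.
    variables = {"GIT_SSL_CAINFO": tls.ca_chain}
    # Assumption: the client pair is exported only when both parts are set.
    if tls.client_cert and tls.client_key:
        variables["GIT_SSL_CERT"] = tls.client_cert
        variables["GIT_SSL_KEY"] = tls.client_key
    return variables
```

Note that nothing in this model adds the system CAs: whatever chain was resolved from GitLab's response is all that git sees.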
The problem here is that GIT_SSL_CAINFO contains a CA chain that can verify only the certificate used by the GitLab instance from which the job was requested. We've already found this to be problematic when external submodules available over HTTPS are used.
Here we have a similar problem caused by LFS. The LFS operation is handled by git. Git connects to GitLab (so the certificate at this point is verified properly), but GitLab sends a redirect to S3. Git then tries to connect to S3, but it is still limited by GIT_SSL_CAINFO. Unless the S3 endpoint's certificate is signed by the same CA, this fails at the certificate verification stage. Since they are using Amazon S3, the domain (and certificate) is *.s3.amazonaws.com, while GitLab's domain is, for example, gitlab.example.com, so the S3 certificate can't be verified.
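The failure can be illustrated with a toy model in which "verification" is reduced to checking whether a host's issuer is in the trusted set. This is not real TLS code; the issuer names, the my-bucket.s3.amazonaws.com host, and all function names are hypothetical and serve only to show why the second connection fails.

```python
from typing import Dict, FrozenSet

# Illustrative issuers only; the real CA names are not given in this report.
ISSUER_OF: Dict[str, str] = {
    "gitlab.example.com": "internal-ca",
    "my-bucket.s3.amazonaws.com": "public-ca",  # hypothetical bucket host
}


class UnknownAuthority(Exception):
    """Raised when a host's issuer is missing from the trusted set."""


def verify(host: str, trusted_cas: FrozenSet[str]) -> None:
    # Stand-in for TLS verification: the issuer must be in the trusted set.
    if ISSUER_OF[host] not in trusted_cas:
        raise UnknownAuthority(
            f"x509: certificate signed by unknown authority: {host}"
        )


def lfs_download(trusted_cas: FrozenSet[str]) -> str:
    verify("gitlab.example.com", trusted_cas)          # LFS request to GitLab
    verify("my-bucket.s3.amazonaws.com", trusted_cas)  # redirect to object store
    return "downloaded"


# Trust set built only from GitLab's chain, mirroring GIT_SSL_CAINFO.
gitlab_only = frozenset({"internal-ca"})
# Trust set that also includes the CAs a system store would provide.
with_system = frozenset({"internal-ca", "public-ca"})
```

In this model, lfs_download(gitlab_only) passes the first check and raises UnknownAuthority on the second, while lfs_download(with_system) completes.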
Actual behavior
The runner validates the remote object store's certificate against the GitLab server's CA chain only. Because of this, the job returns Smudge error: Error downloading...x509: certificate signed by unknown authority.
Expected behavior
The runner should use the system trust store and not just the GitLab server CA.
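One possible way to get this behavior, shown here purely as a sketch, is to build the bundle that git receives from both the system CA bundle and the chain resolved from GitLab. The merge_ca_bundles helper below is hypothetical and is not taken from Runner; it only concatenates PEM certificate blocks and drops exact duplicates.

```python
import re
from typing import List, Set

# Matches one PEM-encoded certificate block.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def merge_ca_bundles(*bundles: str) -> str:
    """Concatenate PEM bundles, keeping first-seen order and dropping duplicates."""
    seen: Set[str] = set()
    merged: List[str] = []
    for bundle in bundles:
        for cert in PEM_BLOCK.findall(bundle):
            key = "".join(cert.split())  # ignore whitespace differences
            if key not in seen:
                seen.add(key)
                merged.append(cert)
    # One certificate block per entry, each terminated by a newline.
    return "".join(cert + "\n" for cert in merged)
```

Under this assumption, the result of merge_ca_bundles(system_bundle, gitlab_chain) would be written to the file that GIT_SSL_CAINFO points to, so that both the GitLab instance and public endpoints such as S3 can be verified.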
Environment description
Experienced with AWS S3 as the remote object store.
Customer ticket -> https://gitlab.zendesk.com/agent/tickets/99491 (internal use)