Smudge error: Error downloading...x509: certificate signed by unknown authority with LFS remote object store
When a remote object store is used for LFS, the runner fails with "Smudge error: Error downloading...x509: certificate signed by unknown authority" when trying to download objects.
When the Runner makes its first request (the request for a new job), it:
- gets the TLS certificate and chain from GitLab's response,
- tries to generate a full certificate chain based on the response data,
- attaches the resolved chain to the job data,
- attaches the specified client key/certificate pair (if configured) to the job data.
The chain from the job data is then exported as the GIT_SSL_CAINFO variable during job execution. This becomes the only source of truth for git's certificate validation. If a client key/certificate pair was configured, its content is also exported through the GIT_SSL_CERT variables, which allows git to handle TLS client authentication that may be configured in front of GitLab.
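The export step described above can be sketched roughly as follows. This is a minimal illustration, not the Runner's actual implementation: the JobTLSData type, the build_git_tls_env function and the file names are hypothetical, and the use of GIT_SSL_KEY for the client key is an assumption (it is the variable git itself reads for a client key, but this report only names GIT_SSL_CERT).

```python
import os
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class JobTLSData:
    """Hypothetical container for the TLS data attached to a job."""
    ca_chain: str                      # PEM chain resolved from GitLab's response
    client_cert: Optional[str] = None  # PEM client certificate, if configured
    client_key: Optional[str] = None   # PEM client key, if configured


def build_git_tls_env(tls: JobTLSData, workdir: str) -> Dict[str, str]:
    """Write the TLS material to files and return the variables git reads."""

    def write(name: str, content: str) -> str:
        path = os.path.join(workdir, name)
        with open(path, "w") as handle:
            handle.write(content)
        return path

    # The resolved chain becomes git's only CA source for this job.
    env = {"GIT_SSL_CAINFO": write("ca_chain.pem", tls.ca_chain)}

    # The client pair is exported only when both halves are configured.
    if tls.client_cert and tls.client_key:
        env["GIT_SSL_CERT"] = write("client_cert.pem", tls.client_cert)
        env["GIT_SSL_KEY"] = write("client_key.pem", tls.client_key)  # assumption
    return env
```

Note that nothing here adds the system trust store: whatever ends up in the GIT_SSL_CAINFO file is all that git will trust.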
The problem here is that GIT_SSL_CAINFO contains a CA chain that can verify only the certificate used by the GitLab instance from which the job was requested. We've already found this to be problematic when external submodules available over HTTPS are used.
Here we have a similar problem caused by LFS. The LFS operation is handled by git. Git connects to GitLab (so the certificate at this point is verified properly), but GitLab sends a redirect to S3. Git then tries to connect to S3, but it is still limited by GIT_SSL_CAINFO. Unless the S3 endpoint's certificate is signed by the same CA, this fails at the certificate verification stage. Since they are using Amazon S3, the domain (and certificate) is *.s3.amazonaws.com, while GitLab's domain is, for example, gitlab.example.com, so the S3 certificate can't be verified.
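The failure can be illustrated with a toy model of the two hops. This is only a sketch of the reasoning: real verification walks full certificate chains, and the issuer names and the bucket.s3.amazonaws.com host below are hypothetical placeholders.

```python
from typing import Dict, List, Optional, Set

# Hypothetical issuer of each server certificate, for illustration only.
SERVER_ISSUER: Dict[str, str] = {
    "gitlab.example.com": "Example Internal CA",
    "bucket.s3.amazonaws.com": "Public CA",
}


def first_untrusted_host(hops: List[str], trusted_issuers: Set[str]) -> Optional[str]:
    """Return the first host whose issuer is not trusted, or None if all pass."""
    for host in hops:
        if SERVER_ISSUER[host] not in trusted_issuers:
            return host
    return None


# An LFS download talks to GitLab first, then follows the redirect to S3.
lfs_hops = ["gitlab.example.com", "bucket.s3.amazonaws.com"]

# Trust limited to the chain resolved from GitLab's response (GIT_SSL_CAINFO).
cainfo_only = {"Example Internal CA"}

# Trust that also covers publicly trusted CAs, as a system store would.
cainfo_plus_system = {"Example Internal CA", "Public CA"}

print(first_untrusted_host(lfs_hops, cainfo_only))         # the S3 hop fails
print(first_untrusted_host(lfs_hops, cainfo_plus_system))  # no hop fails
```

The first hop passes in both cases, which is why the problem only appears once GitLab redirects the download to the object store.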
The runner uses the CA chain of the GitLab server to verify the remote object store as well. Because of this, the job returns "Smudge error: Error downloading...x509: certificate signed by unknown authority".
The runner should use the system trusted store and not just the GitLab server CA.
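One possible way to do this, sketched under assumptions, is to build a combined bundle (system CA bundle plus the GitLab chain) and point GIT_SSL_CAINFO at it. The build_combined_ca_bundle function and its parameters are hypothetical and not part of the Runner; the sketch also assumes the system store is available as a single PEM file, which is not true on every platform.

```python
import ssl
from pathlib import Path
from typing import Optional


def build_combined_ca_bundle(
    gitlab_chain_pem: str,
    out_path: str,
    system_bundle: Optional[str] = None,
) -> str:
    """Write the system CA bundle followed by the GitLab chain to out_path."""
    if system_bundle is None:
        # May be None when OpenSSL has no single default CA file.
        system_bundle = ssl.get_default_verify_paths().cafile

    parts = []
    if system_bundle and Path(system_bundle).is_file():
        parts.append(Path(system_bundle).read_text(encoding="utf-8"))
    # Always keep the chain resolved for the GitLab instance itself.
    parts.append(gitlab_chain_pem)

    combined = "\n".join(part.strip() for part in parts) + "\n"
    Path(out_path).write_text(combined, encoding="utf-8")
    return out_path
```

With such a bundle, git could verify both the GitLab instance (through its own chain) and publicly signed endpoints such as S3 (through the system CAs). If no system bundle is found, the output falls back to the GitLab chain alone, which matches the current behavior.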
This has been seen with AWS S3.
Customer ticket -> https://gitlab.zendesk.com/agent/tickets/99491 (internal use)