When using TLS, 'failed to connect to all addresses' errors caused by Gitaly not trusting the cert it serves
Gitaly-Ruby calls back into the main Gitaly process over TLS when TLS is enabled. However, unless the cert has also been placed in /etc/gitlab/trusted-certs, Gitaly does not trust the cert it is serving and rejects the connection with errors like:
{
"correlation_id": "yWAbgfNrIx8",
"error": "rpc error: code = Unavailable desc = failed to connect to all addresses",
"grpc.code": "Unavailable",
"grpc.meta.auth_version": "v2",
"grpc.meta.client_name": "gitlab-web",
"grpc.meta.deadline_type": "regular",
"grpc.method": "UserCommitFiles",
"grpc.request.deadline": "2020-04-16T17:09:56Z",
"grpc.request.fullMethod": "/gitaly.OperationService/UserCommitFiles",
"grpc.service": "gitaly.OperationService",
"grpc.start_time": "2020-04-16T17:09:01Z",
"grpc.time_ms": 30.62,
"level": "warning",
"msg": "finished streaming call with code Unavailable",
"peer.address": "10.128.0.188:39280",
"pid": 16514,
"span.kind": "server",
"system": "grpc",
"time": "2020-04-16T17:09:01.892Z"
}
{
"level": "info",
"msg": "E0416 17:09:01.892111283 16570 ssl_transport_security.cc:1245] Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED.",
"supervisor.args": [
"bundle",
"exec",
"bin/ruby-cd",
"/var/opt/gitlab/gitaly",
"/opt/gitlab/embedded/service/gitaly-ruby/bin/gitaly-ruby",
"16514",
"/var/opt/gitlab/gitaly/internal_sockets/ruby.1"
],
"supervisor.name": "gitaly-ruby.1",
"time": "2020-04-16T17:09:01.893Z"
}
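To check whether an instance is hitting this, the two log entries above can be searched for together. The following is a minimal sketch, assuming the Gitaly log is available as one JSON object per line; the function name find_tls_failures is hypothetical, and the field names are taken from the entries above.

```python
import json


def find_tls_failures(lines):
    """Split JSON log lines into (unavailable_calls, handshake_failures).

    Assumes one JSON object per line; lines that do not parse are skipped.
    """
    unavailable, handshake = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a structured log line
        if not isinstance(entry, dict):
            continue
        if entry.get("grpc.code") == "Unavailable":
            # gRPC calls that finished with code Unavailable
            unavailable.append(entry)
        elif "CERTIFICATE_VERIFY_FAILED" in str(entry.get("msg", "")):
            # handshake errors relayed from the gitaly-ruby supervisor
            handshake.append(entry)
    return unavailable, handshake
```

If both lists are non-empty and their time values are close together, as in the example above, the cert trust problem is a likely cause.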
Our docs direct admins to place the certs in /etc/gitlab/ssl. The gitlab-rake gitlab:gitaly:check task will succeed, as will most basic navigation tasks. The problem only becomes apparent when the user hits specific Gitaly-Ruby methods such as UserCommitFiles.
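Since the check task passes, one way to spot the misconfiguration ahead of time is to compare the served cert against the contents of /etc/gitlab/trusted-certs. The sketch below is illustrative only: the function name cert_is_trusted is hypothetical, and it assumes that a byte-identical copy (or symlink) of the cert file in that directory is what makes it trusted. A re-encoded copy of the same cert would not be detected.

```python
import hashlib
from pathlib import Path


def cert_is_trusted(cert_path, trusted_dir="/etc/gitlab/trusted-certs"):
    """Return True if a byte-identical copy of cert_path exists in trusted_dir."""
    cert = Path(cert_path)
    trusted = Path(trusted_dir)
    if not cert.is_file() or not trusted.is_dir():
        return False
    digest = hashlib.sha256(cert.read_bytes()).digest()
    for candidate in trusted.iterdir():
        # is_file() follows symlinks, so linked certs are compared too
        if candidate.is_file():
            if hashlib.sha256(candidate.read_bytes()).digest() == digest:
                return True
    return False
```

It would be called with the path configured in gitaly['certificate_path'], for example cert_is_trusted("/etc/gitlab/ssl/cert.pem"), where cert.pem is a placeholder file name.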
We should make Gitaly trust the cert defined in gitaly['certificate_path'].
/cc @zj-gitlab