gitaly-ruby does not report underlying HTTP errors when proxy errors occur
Customer ticket: https://gitlab.zendesk.com/agent/tickets/174915
This issue caused a lot of grief this past weekend, and many hours were spent troubleshooting the issue. Quick summary:
- When trying to commit a file via the Web UI, the request would timeout with a
DeadlineExceeded
error. However, pushes via the command-line worked fine. - The customer had
https_proxy
set with an HTTP, not HTTPS, proxy. - As a result, all gRPC calls made by
gitaly-ruby
were proxied, which was unexpected. - Fixing this required adding the Gitaly host IP to the
no_proxy
list. - The proxy server itself was returning a clear error message, but this message was nowhere to be found in the Gitaly logs. We only found this in the
strace
output.
To reproduce this problem, set up something like:
gitaly['env'] = {
'https_proxy' => 'http://proxy.example.com',
'http_proxy' => 'http://proxy.example.com',
'no_proxy' => 'gitlab.example.com'
}
Note that https_proxy
has a proxy prefixed with http://
instead of https://
. This has an unintended side effect of causing gitaly-ruby to proxy all gRPC calls through the proxy, even when gitaly-ruby is trying to talk to its parent Gitaly process.
If an HTTPS proxy is used instead, gRPC will ignore this setting since HTTPS proxies are not supported (https://github.com/grpc/grpc/issues/20939). Instead, you get this warning message at startup:
{
"level":"info",
"msg":"E1004 06:12:06.742912935 5366 http_proxy.cc:62] 'https' scheme not supported in proxy URI",
"supervisor.args":[
"bundle",
"exec",
"bin/ruby-cd",
"/var/opt/gitlab/gitaly",
"/opt/gitlab/embedded/service/gitaly-ruby/bin/gitaly-ruby",
"5255",
"/tmp/gitaly-ruby148060389/socket.1"
],
"supervisor.name":"gitaly-ruby.1",
"time":"2020-10-04T06:12:06Z"
}
Using tinyproxy
and allowing all CONNECT
calls, you can see which gRPC calls are being made by gitaly-ruby during the UserCommitFiles
RPC:
This looks like it originates from https://gitlab.com/gitlab-org/gitaly/blob/5c284bc03e942ab4aed269d316c84b26b1e91aaf/ruby/lib/gitlab/git/operation_service.rb#L82.
sequenceDiagram
autonumber
Client->>+Rails: POST /group/project/master/README.md
Rails->>+Gitaly: UserCommitFiles
Gitaly->>+gitaly_ruby: UserCommitFiles
gitaly_ruby->>proxy: CONNECT
gitaly_ruby->>proxy: HasLocalBranches?
proxy->>+Gitaly: HasLocalBranches?
gitaly_ruby->>proxy: RefExists?
proxy->>+Gitaly: RefExists?
gitaly_ruby->>gitaly_hooks: pre-receive
gitaly_hooks->>Rails: POST /api/v4/internal/allowed
Rails->>gitaly_hooks: 200 OK
gitaly_hooks->>gitaly_ruby: OK
gitaly_ruby->>gitaly_ruby: update-ref
Step 4 is where the proxying of RPC happens, and that shouldn't be happening.
Ideas for improvement:
- Log the underlying gRPC error for gitaly-ruby somewhere. (this issue)
- Document the need to add the current host IP to the
no_proxy
list if an HTTP proxy is used in HTTPS. - Add some Gitaly check test (similar to the
gitlab-shell/bin/check
) that tests for:- Connectivity with the GitLab API
- Have gitaly-ruby make a loopback request back to the Gitaly server