Skip to content

gitaly-ruby does not report underlying HTTP errors when proxy errors occur

Customer ticket: https://gitlab.zendesk.com/agent/tickets/174915

This issue caused a lot of grief this past weekend, and many hours were spent troubleshooting the issue. Quick summary:

  1. When trying to commit a file via the Web UI, the request would timeout with a DeadlineExceeded error. However, pushes via the command-line worked fine.
  2. The customer had https_proxy set with an HTTP, not HTTPS, proxy.
  3. As a result, all gRPC calls made by gitaly-ruby were proxied, which was unexpected.
  4. Fixing this required adding the Gitaly host IP to the no_proxy list.
  5. The proxy server itself was returning a clear error message, but this message was nowhere to be found in the Gitaly logs. We only found this in the strace output.

To reproduce this problem, set up something like:

gitaly['env'] = {
  'https_proxy' => 'http://proxy.example.com',
  'http_proxy' => 'http://proxy.example.com',
  'no_proxy' => 'gitlab.example.com'
}

Note that https_proxy has a proxy prefixed with http:// instead of https://. This has an unintended side effect of causing gitaly-ruby to proxy all gRPC calls through the proxy, even when gitaly-ruby is trying to talk to its parent Gitaly process.

If an HTTPS proxy is used instead, gRPC will ignore this setting since HTTPS proxies are not supported (https://github.com/grpc/grpc/issues/20939). Instead, you get this warning message at startup:

{
   "level":"info",
   "msg":"E1004 06:12:06.742912935    5366 http_proxy.cc:62]           'https' scheme not supported in proxy URI",
   "supervisor.args":[
      "bundle",
      "exec",
      "bin/ruby-cd",
      "/var/opt/gitlab/gitaly",
      "/opt/gitlab/embedded/service/gitaly-ruby/bin/gitaly-ruby",
      "5255",
      "/tmp/gitaly-ruby148060389/socket.1"
   ],
   "supervisor.name":"gitaly-ruby.1",
   "time":"2020-10-04T06:12:06Z"
}

Using tinyproxy and allowing all CONNECT calls, you can see which gRPC calls are being made by gitaly-ruby during the UserCommitFiles RPC:

image

This looks like it originates from https://gitlab.com/gitlab-org/gitaly/blob/5c284bc03e942ab4aed269d316c84b26b1e91aaf/ruby/lib/gitlab/git/operation_service.rb#L82.

sequenceDiagram
    autonumber
    Client->>+Rails: POST /group/project/master/README.md
    Rails->>+Gitaly: UserCommitFiles
    Gitaly->>+gitaly_ruby: UserCommitFiles
    gitaly_ruby->>proxy: CONNECT
    gitaly_ruby->>proxy: HasLocalBranches?
    proxy->>+Gitaly: HasLocalBranches?
    gitaly_ruby->>proxy: RefExists?
    proxy->>+Gitaly: RefExists?
    gitaly_ruby->>gitaly_hooks: pre-receive
    gitaly_hooks->>Rails: POST /api/v4/internal/allowed
    Rails->>gitaly_hooks: 200 OK
    gitaly_hooks->>gitaly_ruby: OK
    gitaly_ruby->>gitaly_ruby: update-ref

Step 4 is where the proxying of RPC happens, and that shouldn't be happening.

proxy.pcap

Ideas for improvement:

  1. Log the underlying gRPC error for gitaly-ruby somewhere. (this issue)
  2. Document the need to add the current host IP to the no_proxy list if an HTTP proxy is used in HTTPS.
  3. Add some Gitaly check test (similar to the gitlab-shell/bin/check) that tests for:
    1. Connectivity with the GitLab API
    2. Have gitaly-ruby make a loopback request back to the Gitaly server
Edited by Stan Hu
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information