
Geo: Fix timeouts when pushing via SSH to secondary

Currently the code for proxying SSH pushes to secondaries lives inside Rails. It would be better suited to Workhorse.

https://gitlab.com/gitlab-org/gitlab-ee/issues/6533

The current workflow for pushing to a secondary is described below.

Push over HTTP(s)

  1. the info/refs request, on the secondary: GET /root/push.git/info/refs?service=git-receive-pack HTTP/1.1 302

    Gets a 302 to "location":"http://unified.url/-/push_from_secondary/2/root/push.git/info/refs"

  2. redirected info/refs request, hits the secondary and gets proxied to the primary in Workhorse (through Geo secondary proxying): GET /-/push_from_secondary/2/root/push.git/info/refs?service=git-receive-pack HTTP/1.1 401

    a. unauthenticated first (401)

    b. authenticated afterwards (200)

Note: with unified URLs, the redirected URL in step 1 is still going to hit the secondary (if on a secondary), but it'll be proxied to the primary

  3. git-receive-pack push, hits the secondary, gets proxied to the primary in Workhorse: POST /-/push_from_secondary/2/root/push.git/git-receive-pack HTTP/1.1 200

Verbose log from the client perspective
14:04:35.713313 http.c:623              => Send header: GET /root/push.git/info/refs?service=git-receive-pack HTTP/1.1
14:04:35.713325 http.c:623              => Send header: Host: secondary.tld
...
14:04:35.807510 http.c:623              <= Recv header: HTTP/1.1 302 Found
14:04:35.807617 http.c:623              <= Recv header: location: http://unified.tld/-/push_from_secondary/2/root/push.git/info/refs?service=git-receive-pack
...
14:04:35.915135 http.c:623              => Send header: GET /-/push_from_secondary/2/root/push.git/info/refs?service=git-receive-pack HTTP/1.1
14:04:35.915147 http.c:623              => Send header: Host: unified.tld
...
14:04:36.015615 http.c:623              <= Recv header: HTTP/1.1 401 Unauthorized
14:04:43.209457 http.c:664              == Info: Server auth using Basic with user 'root'
14:04:43.209571 http.c:623              => Send header: GET /-/push_from_secondary/2/root/push.git/info/refs?service=git-receive-pack HTTP/1.1
14:04:43.209583 http.c:623              => Send header: Host: unified.tld
14:04:43.209593 http.c:623              => Send header: Authorization: Basic <redacted>
...
14:04:43.563351 http.c:623              <= Recv header: HTTP/1.1 200 OK
14:04:43.563398 http.c:623              <= Recv header: content-type: application/x-git-receive-pack-advertisement
...
warning: redirecting to http://unified.tld/-/push_from_secondary/2/root/push.git/
14:04:43.564571 run-command.c:654       trace: run_command: git send-pack --stateless-rpc --helper-status --thin --progress http://unified.tld/-/push_from_secondary/2/root/push.git/ --stdin
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 16 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 278 bytes | 278.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
14:04:43.576303 http.c:664              == Info: Server auth using Basic with user 'root'
14:04:43.576436 http.c:623              => Send header: POST /-/push_from_secondary/2/root/push.git/git-receive-pack HTTP/1.1
14:04:43.576447 http.c:623              => Send header: Host: unified.tld
14:04:43.576453 http.c:623              => Send header: Authorization: Basic <redacted>
14:04:43.576489 http.c:623              => Send header: Accept: application/x-git-receive-pack-result
...
14:04:43.942470 http.c:623              <= Recv header: HTTP/1.1 200 OK
14:04:43.942563 http.c:623              <= Recv header: content-type: application/x-git-receive-pack-result
remote:
remote: This request to a Geo secondary node will be forwarded to the
remote: Geo primary node:
remote:
remote:   http://unified.tld/root/push.git
remote:
remote:
To http://secondary.tld/root/push.git
   d99395a..fa2c992  main -> main
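
The redirect in step 1 amounts to a rewrite of the request URL. The following is a minimal sketch of that rewrite, assuming only the `/-/push_from_secondary/<node id>` prefix visible in the log above. The function name and its parameters are hypothetical and are not taken from the GitLab codebase.

```python
from urllib.parse import urlsplit, urlunsplit


def push_redirect_url(request_url: str, unified_host: str, node_id: int) -> str:
    """Rewrite a Git HTTP push URL into its push_from_secondary form.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(request_url)
    # Prefix the original path; the query string (service=...) is kept as-is.
    new_path = f"/-/push_from_secondary/{node_id}{parts.path}"
    return urlunsplit((parts.scheme, unified_host, new_path, parts.query, ""))


if __name__ == "__main__":
    print(
        push_redirect_url(
            "http://secondary.tld/root/push.git/info/refs?service=git-receive-pack",
            "unified.tld",
            2,
        )
    )
```

The same rewrite applies to the later `git-receive-pack` POST, since only the path prefix and host change.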

Push over SSH

  1. SSH Key authorization check in /api/v4/internal/authorized_keys?key=[FILTERED]

  2. /api/v4/internal/allowed with a 300 status for custom actions:

    {
      "endpoint": "/api/v4/geo/proxy_git_ssh/info_refs_receive_pack",
      "msg": "customaction: processApiEndpoints: Performing custom action",
      "primary_repo": "http://primary.geo-gitlab-test.ml/root/push.git"
    }
  3. POST "/api/v4/geo/proxy_git_ssh/info_refs_receive_pack" to the internal API on the secondary with the info/refs request; Gitlab::Geo::GitSSHProxy in Rails proxies it to the primary

  4. POST "/api/v4/geo/proxy_git_ssh/receive_pack" to the internal API on the secondary with the receive-pack data; Gitlab::Geo::GitSSHProxy in Rails proxies this to the primary as well

All output is Base64 encoded (https://gitlab.com/gitlab-org/gitlab/blob/b056b9fb37910673ae7c690191b8aaec3d677d56/ee/lib/gitlab/geo/git_ssh_proxy.rb#L29) before being sent to the primary.
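
One likely reason for the Base64 step is that Git protocol data is binary and cannot be placed in a JSON body as-is. The sketch below shows the round trip in isolation. The `output` field name and both helper functions are assumptions made for illustration; the actual payload shape is defined in the linked `git_ssh_proxy.rb`.

```python
import base64
import json


def wrap_output(raw: bytes) -> str:
    """Encode binary Git output as Base64 inside a JSON body.

    The "output" key is a hypothetical field name.
    """
    return json.dumps({"output": base64.b64encode(raw).decode("ascii")})


def unwrap_output(body: str) -> bytes:
    """Reverse wrap_output: parse the JSON body and decode the Base64 field."""
    return base64.b64decode(json.loads(body)["output"])


if __name__ == "__main__":
    raw = b"0000\x00PACK\xff\xfe"  # arbitrary bytes that are not valid UTF-8
    assert unwrap_output(wrap_output(raw)) == raw
```

Base64 inflates the payload by roughly a third, which adds to the cost of routing large pushes through the Rails API.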

More details on the push logic from gitlab-workhorse!320 (closed):

  1. git push to secondary from command line
  2. check_custom_action(cmd) (https://gitlab.com/gitlab-org/gitlab-ee/blob/master/lib/gitlab/git_access.rb#L68) is examined
  3. For a valid Geo setup, https://gitlab.com/gitlab-org/gitlab-ee/blob/master/ee/lib/ee/gitlab/geo_git_access.rb#L34-47 is executed, which returns a JSON payload to gitlab-shell with the endpoints to POST to in sequence - ["/api/v4/geo/proxy_git_push_ssh/info_refs", "/api/v4/geo/proxy_git_push_ssh/push"]
  4. Each endpoint is POST'd to on the secondary node, with any output being sent to the following endpoint (e.g. output from /api/v4/geo/proxy_git_push_ssh/info_refs is POST'd to /api/v4/geo/proxy_git_push_ssh/push)
  5. For each /api/v4/geo/proxy_git_push_ssh/* API endpoint on the secondary node, logic for info_refs (https://gitlab.com/gitlab-org/gitlab-ee/blob/master/ee/lib/gitlab/geo/git_push_ssh_proxy.rb#L57) and push (https://gitlab.com/gitlab-org/gitlab-ee/blob/master/ee/lib/gitlab/geo/git_push_ssh_proxy.rb#L71) is executed.
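
The endpoint chaining in steps 3 and 4 can be sketched as a simple loop. This is a minimal illustration of the sequencing as described above, not gitlab-shell's implementation: `run_custom_action` and the injected `post` callable are hypothetical names, and the HTTP transport is deliberately left out so the flow can be followed on its own.

```python
from typing import Callable, Sequence


def run_custom_action(
    endpoints: Sequence[str],
    post: Callable[[str, bytes], bytes],
    initial_input: bytes = b"",
) -> bytes:
    """POST to each endpoint in order, feeding each response into the next request."""
    data = initial_input
    for endpoint in endpoints:
        data = post(endpoint, data)
    return data


if __name__ == "__main__":
    calls = []

    def fake_post(endpoint: str, body: bytes) -> bytes:
        # Stand-in for an HTTP POST to the secondary's internal API.
        calls.append((endpoint, body))
        return b"response from " + endpoint.encode()

    run_custom_action(
        [
            "/api/v4/geo/proxy_git_push_ssh/info_refs",
            "/api/v4/geo/proxy_git_push_ssh/push",
        ],
        fake_post,
    )
    # The second request body is the first endpoint's output.
    print(calls[1])
```

Because every step is a separate request through Rails, the whole push has to complete within the request timeouts of that path, which is the problem named in this issue's title.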
Edited by Catalin Irimie