Outline the flow of Git push over SSH to a secondary site
Goal
In order to figure out how to proxy Git push over SSH from a Geo secondary site to a Geo primary site, we need to understand the existing flow.
Questions
- How do SSH connections get from the user into GitLab Shell for Omnibus and Charts?
- Any diagrams out there?
- Geo has https://docs.gitlab.com/ee/development/geo/proxying.html#git-push-over-ssh, but I'm wondering about the details in the arrow from `Git client` to `GitLab Shell (secondary)`.
GET Omnibus 3k environment
- User does `git push`.
- User's Git client spawns an SSH client, calling e.g. `ssh -x git@server "git-receive-pack 'simplegit-progit.git'"`. See Smart Protocol > Uploading Data > SSH.
- The SSH client connects to the GitLab external IP on port 2222 (the SSH port that GET's external HAProxy listens on).
- The connection goes to a GET haproxy-external node.
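The step where the Git client spawns an SSH client can be sketched roughly as follows. This is a simplified illustration, not Git's actual implementation: the function name `ssh_argv_for_push` and the example host `gitlab.example.com` are hypothetical, and real Git may pass additional SSH options (such as the `-x` shown above) that this sketch omits.

```python
from urllib.parse import urlsplit


def ssh_argv_for_push(remote_url: str) -> list[str]:
    """Build an ssh argv roughly like the one Git spawns for a push (simplified)."""
    if "://" in remote_url:
        # URL form: ssh://user@host[:port]/path
        parts = urlsplit(remote_url)
        host = parts.hostname
        user_host = f"{parts.username}@{host}" if parts.username else host
        port = parts.port
        path = parts.path
    else:
        # scp-like form: user@host:path
        user_host, path = remote_url.split(":", 1)
        port = None

    argv = ["ssh"]
    if port is not None:
        argv += ["-p", str(port)]
    # The remote command is passed as one string; the repository path is single-quoted.
    argv += [user_host, f"git-receive-pack '{path}'"]
    return argv


if __name__ == "__main__":
    print(ssh_argv_for_push("git@server:simplegit-progit.git"))
    print(ssh_argv_for_push("ssh://git@gitlab.example.com:2222/group/project.git"))
```

The point relevant to the rest of this flow is that the server side only ever sees a single command string, `git-receive-pack '<path>'`, arriving over an otherwise ordinary SSH connection.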
On a GET haproxy-external node, in /opt/haproxy/haproxy.cfg:
frontend gitlab-ssh-in
    bind *:2222
    mode tcp
    option tcplog
    option clitcpka
    default_backend gitlab-rails-ssh

backend gitlab-rails-ssh
    mode tcp
    option tcp-check
    option httpchk GET /-/readiness
    option srvtcpka
    server gitlab-rails1 <IP of GitLab Rails node>:22 track gitlab-rails/gitlab-rails1
root@gitlab-haproxy-external-1:~# lsof -i -P -n | grep 2222
docker-pr 23742 root 4u IPv4 492157 0t0 TCP *:2222 (LISTEN)
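To make the frontend-to-backend pairing in the config above explicit, here is a minimal sketch that traces each `bind` address to the `server` addresses of its `default_backend`. It is a toy parser that only understands the handful of keywords used here, not a general HAProxy config parser. The function name and the address `192.0.2.10` are hypothetical placeholders (the real config has `<IP of GitLab Rails node>`).

```python
EXAMPLE_CFG = """
frontend gitlab-ssh-in
    bind *:2222
    mode tcp
    default_backend gitlab-rails-ssh

backend gitlab-rails-ssh
    mode tcp
    server gitlab-rails1 192.0.2.10:22 track gitlab-rails/gitlab-rails1
"""


def ssh_forwarding_map(cfg_text: str) -> dict[str, list[str]]:
    """Map each frontend bind address to the server addresses of its default backend."""
    frontends: dict[str, dict] = {}
    backends: dict[str, list[str]] = {}
    section = name = None
    for raw in cfg_text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] in ("frontend", "backend") and len(words) > 1:
            section, name = words[0], words[1]
            if section == "frontend":
                frontends[name] = {"binds": [], "backend": None}
            else:
                backends[name] = []
        elif words[0] in ("global", "defaults", "listen"):
            # Sections this sketch does not model.
            section = name = None
        elif section == "frontend" and words[0] == "bind" and len(words) > 1:
            frontends[name]["binds"].append(words[1])
        elif section == "frontend" and words[0] == "default_backend" and len(words) > 1:
            frontends[name]["backend"] = words[1]
        elif section == "backend" and words[0] == "server" and len(words) > 2:
            # "server <name> <address> ..." -> keep the address.
            backends[name].append(words[2])
    return {
        bind: backends.get(fe["backend"], [])
        for fe in frontends.values()
        for bind in fe["binds"]
    }


if __name__ == "__main__":
    print(ssh_forwarding_map(EXAMPLE_CFG))
```

For the example config, the map contains a single entry from `*:2222` to the Rails node's port 22.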
- Therefore, HAProxy forwards the connection to port 22 of a GitLab Rails node.
- OpenSSH is a dependency of Omnibus GitLab
- gitlab-shell gets installed on GitLab Rails nodes
- GET modifies `/etc/ssh/sshd_config` to call a custom `authorized_keys` binary provided by gitlab-shell.
- Now, instead of checking the `authorized_keys` file, SSHD calls the binary, which makes a request to the Rails app at `/api/v4/internal/authorized_keys`.
- The Rails app responds with the key if found.
- SSH considers the connection authorized
- Recall that the user's SSH client had called e.g. `git-receive-pack 'simplegit-progit.git'`; therefore, after authorization, the server executes `git-receive-pack` in a non-interactive session.
- `git-receive-pack` exists in `/usr/bin`:
root@gitlab-gitlab-rails-1:~# which git-receive-pack
/usr/bin/git-receive-pack
root@gitlab-gitlab-rails-1:~# ls -al /usr/bin/git-receive-pack
lrwxrwxrwx 1 root root 3 Oct 14 14:15 /usr/bin/git-receive-pack -> git
- I think `/usr/bin/git-receive-pack` was installed by Git.
- I think `/opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell git-receive-pack` is supposed to be called, but I don't see how.
- When I push a new large repo, after the authorized keys check passes, while my Git client is uploading, my GitLab Rails node shows `/opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell` running:
# ps aux
git 25635 4.0 0.1 207932 7800 ? S 00:10 0:04 sshd: git@notty
git 25636 0.0 0.0 4632 820 ? Ss 00:10 0:00 sh -c /opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell key-1
git 25637 8.6 0.2 1683096 20464 ? Sl 00:10 0:08 /opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell key-1
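One possible explanation for the `sh -c .../gitlab-shell key-1` process above is an OpenSSH forced command. If the authorized keys lookup returns an entry carrying a `command="..."` option, SSHD runs that command instead of what the client requested, and exposes the client's request in the `SSH_ORIGINAL_COMMAND` environment variable. The sketch below models that behaviour. It has not been verified against this environment: the exact option list, the function names, and the example public key are assumptions for illustration.

```python
GITLAB_SHELL = "/opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell"


def authorized_keys_line(key_id: int, public_key: str) -> str:
    """Format an authorized_keys entry whose forced command runs gitlab-shell.

    The option list here is an assumption, not copied from a real node.
    """
    options = [
        f'command="{GITLAB_SHELL} key-{key_id}"',
        "no-port-forwarding",
        "no-X11-forwarding",
        "no-agent-forwarding",
        "no-pty",
    ]
    return ",".join(options) + " " + public_key


def what_sshd_runs(entry: str, client_command: str) -> tuple[list[str], dict[str, str]]:
    """Model which command SSHD would run for an entry with a forced command."""
    # Pull the forced command out of the command="..." option.
    forced = entry.split('command="', 1)[1].split('"', 1)[0]
    # The forced command replaces the client's command; the client's request
    # is only passed along in SSH_ORIGINAL_COMMAND.
    return ["sh", "-c", forced], {"SSH_ORIGINAL_COMMAND": client_command}


if __name__ == "__main__":
    entry = authorized_keys_line(1, "ssh-ed25519 AAAAexample user@host")
    argv, env = what_sshd_runs(entry, "git-receive-pack 'simplegit-progit.git'")
    print(argv)
    print(env)
```

Under that assumption, `/usr/bin/git-receive-pack` is never executed directly by SSHD. GitLab Shell would receive the key ID as its argument and read the requested Git command from the environment.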
- After the Git command is invoked in GitLab Shell, the flow is outlined in Geo proxying docs:
Git push over SSH
As SSH operations go through GitLab Shell instead of Workhorse, they are not proxied through the mechanism used for Workhorse requests. With SSH operations, they are proxied as Git HTTP requests to the primary site by the secondary Rails internal API.
sequenceDiagram
participant C as Git client
participant S as GitLab Shell (secondary)
participant I as Internal API (secondary Rails)
participant P as Primary API
C->>S: git push
S->>I: SSH key validation (api/v4/internal/authorized_keys?key=..)
I-->>S: HTTP/1.1 300 (custom action status) with {endpoint, msg, primary_repo}
S->>I: POST /api/v4/geo/proxy_git_ssh/info_refs_receive_pack
I->>P: POST $PRIMARY/foo/bar.git/info/refs/?service=git-receive-pack
P-->>I: HTTP/1.1 200 OK
I-->>S: <response>
S-->>C: return Git response from primary
C-->>S: stream Git data to push
S->>I: POST /api/v4/geo/proxy_git_ssh/receive_pack
I->>P: POST $PRIMARY/foo/bar.git/git-receive-pack
P-->>I: HTTP/1.1 200 OK
I-->>S: <response>
S-->>C: return Git response from primary
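The two proxied round trips in the diagram can be summarised as a mapping from the secondary's internal API endpoints to the primary URLs they forward to. This is only a restatement of the diagram as data, with URL shapes copied from it. The function name and the example primary URL are hypothetical, and nothing here reflects the actual Rails implementation.

```python
def primary_requests(primary_url: str, repo_path: str) -> dict[str, str]:
    """Map each secondary internal API endpoint to the primary URL it proxies to."""
    base = f"{primary_url.rstrip('/')}/{repo_path.strip('/')}.git"
    return {
        # Step 1: ref advertisement for the push.
        "/api/v4/geo/proxy_git_ssh/info_refs_receive_pack":
            f"{base}/info/refs/?service=git-receive-pack",
        # Step 2: the pack data streamed by the Git client.
        "/api/v4/geo/proxy_git_ssh/receive_pack":
            f"{base}/git-receive-pack",
    }


if __name__ == "__main__":
    for endpoint, target in primary_requests("https://primary.example.com", "foo/bar").items():
        print(f"{endpoint} -> {target}")
```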
References
- Gitaly diagram of the flow of a git fetch over SSH
- GitLab SSHD
- An article outlining why we introduced GitLab SSHD, including Git fetch diagrams comparing the Omnibus GitLab OpenSSH architecture vs the new Charts GitLab SSHD architecture
- The initial MR adding GitLab SSHD
- https://docs.gitlab.com/ee/administration/operations/fast_ssh_key_lookup.html#use-gitlab-sshd-instead-of-openssh
- "Yup, I agree, we need to keep the old way, likely forever."
- Issue to add GitLab SSHD to Omnibus GitLab
- Geo proxying flow diagram for Git push over SSH
Edited by Michael Kozono