GitLab runners should support clone over ssh
Release notes
Enable support for runners to clone via SSH rather than HTTP.
Problem to solve
HTTP is easy and works great for the general user, but enterprise organizations usually prefer SSH for cloning Git repositories. HTTP was also designed for small data transfers and can fail when working with extremely large datasets.
Jenkins can clone via SSH. GitHub Actions can clone over SSH. GitLab should support the same.
Although LFS is intended to help with large files in Git, it only works when the repo lives in GitLab and the appropriate modifications are made.
Real-world use case: An enterprise-level customer uses some other Git solution and wants to migrate to GitLab. Business cannot stop while all developers switch at once, so developers keep committing to the other solution and GitLab is set to mirror the external repo.
We cannot configure this repo to use LFS. Because of large binaries in the repo, even a --depth=1 clone is 5 GB, which is too large. Modifying caching doesn't help.
So they eventually see:
error: RPC failed; curl 18 transfer closed with outstanding read data remaining
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
If we could clone over SSH, this would not be a problem.
Intended users
- Cameron (Compliance Manager)
- Delaney (Development Team Lead)
- Sasha (Software Developer)
- Devon (DevOps Engineer)
- Sidney (Systems Administrator)
- Sam (Security Analyst)
- Alex (Security Operations Engineer)
- Simone (Software Engineer in Test)
- Allison (Application Ops)
- Priyanka (Platform Engineer)
- Dana (Data Analyst)
User experience goal
Proposal
If we could teach GitLab's git+ssh endpoint to authorize or reject requests authenticated with a JWT unique to a job, and teach GitLab to impersonate the job owner for such requests (that is, do what is done right now with the git+http protocol and the ci-job-token:$CI_JOB_TOKEN user/password pair), we should also be able to teach Runner to use git+ssh instead of git+http in those cases.
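To make the Runner side of this concrete, here is a minimal sketch in Go of choosing between the two clone URL forms. It is not Runner's actual code: the function `cloneURL`, the `protocol` switch, the `git` SSH user, and the host and project names are all assumptions for illustration. The HTTP username is passed in by the caller rather than hard-coded.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// cloneURL is a hypothetical helper (not part of GitLab Runner) that builds
// a clone URL for either protocol. For "http" it embeds the user/token pair
// in the URL; for "ssh" it assumes the conventional "git" SSH user and that
// authentication happens out of band (for example with a per-job key).
func cloneURL(protocol, host, projectPath, user, token string) (string, error) {
	path := "/" + strings.TrimPrefix(projectPath, "/") + ".git"
	switch protocol {
	case "http":
		u := url.URL{
			Scheme: "https",
			User:   url.UserPassword(user, token),
			Host:   host,
			Path:   path,
		}
		return u.String(), nil
	case "ssh":
		u := url.URL{
			Scheme: "ssh",
			User:   url.User("git"), // assumed SSH user
			Host:   host,
			Path:   path,
		}
		return u.String(), nil
	default:
		return "", fmt.Errorf("unsupported clone protocol %q", protocol)
	}
}

func main() {
	// Placeholder host, project, and token values.
	for _, p := range []string{"http", "ssh"} {
		u, err := cloneURL(p, "gitlab.example.com", "group/project", "ci-job-token", "abc123")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(u)
	}
}
```

The hard part of the proposal, having the git+ssh endpoint accept a job-scoped credential, sits on the GitLab side and is not covered by this sketch.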
We could create an ephemeral SSH key. It would likely have to be generated and sent by Runner, and we could authorize it dynamically. I think in the past the whole reason not to do this was authorized_keys, but since we use the API to check keys today, it seems fine. HTTPS is generally much simpler to configure in all cases, including for CI.
Further details
SSH vs HTTPS .... enough said :) A few issues: