GitLab runners should support clone over ssh

Release notes

Enable support for runners to clone via SSH rather than HTTP.

Problem to solve

HTTP is easy and works great for the general user. Enterprise organizations usually prefer SSH for cloning Git repositories. Moreover, HTTP was designed for small data transfers and fails when working with extremely large datasets.

Jenkins can clone via SSH. GitHub Actions can clone over SSH. GitLab should support the same.

Although LFS is intended to help with large files in Git, it only works when the repo lives in GitLab and the appropriate modifications are made.

Real-world use case: An enterprise-level customer uses some other Git solution and wants to migrate to GitLab. Business cannot stop just to let all developers switch at once. So developers keep committing to the other solution, and GitLab is set to mirror the external repo.

We cannot configure this repo to use LFS. Even a clone with --depth=1 is too large: because of large binaries in the repo, a depth 1 clone is 5 GB. Modifying caching doesn't help.

So they eventually see:

error: RPC failed; curl 18 transfer closed with outstanding read data remaining
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed

If we could clone over SSH, this would not be a problem.

Intended users

User experience goal

Proposal

If we could teach GitLab's git+ssh endpoint to authorize or reject requests authenticated with a JWT unique to a job, and teach GitLab to impersonate the job owner for such requests (that is, do what is done right now with the git+http protocol, where ci-job-token:$CI_JOB_TOKEN is used as the user/pass pair), we should also be able to teach Runner to use git+ssh instead of git+http in those cases.
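To make the comparison concrete, here is a minimal sketch of the two remote URL shapes involved: the HTTP form carrying the job token as credentials, and the SSH form Runner would need instead. The function names, the `git` SSH user, and the example host are assumptions for illustration only; the `ci-job-token` user name is taken from the description above and is passed as a parameter because the exact value may differ. This is not Runner's actual implementation.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def http_clone_url(project_url, job_token, user="ci-job-token"):
    """Embed the user/token pair in an HTTP(S) project URL (hypothetical helper)."""
    parts = urlsplit(project_url)
    # Percent-encode both values so special characters cannot break the URL.
    credentials = f"{quote(user, safe='')}:{quote(job_token, safe='')}"
    netloc = f"{credentials}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, "", ""))


def ssh_clone_url(project_url, ssh_user="git"):
    """Derive an ssh:// remote for the same project (hypothetical helper).

    Assumes the SSH endpoint lives on the same host and default SSH port;
    any HTTP port in the input is deliberately dropped.
    """
    parts = urlsplit(project_url)
    return f"ssh://{ssh_user}@{parts.hostname}{parts.path}"


if __name__ == "__main__":
    url = "https://gitlab.example.com/group/project.git"
    print(http_clone_url(url, "JOB_TOKEN_VALUE"))
    print(ssh_clone_url(url))
```

Note that the SSH form carries no credential in the URL itself, which is why the proposal needs a separate way (a job-scoped JWT or key) to tie the connection back to the job.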

We could create an ephemeral SSH key. It would likely have to be generated and sent by Runner, and we could dynamically authorize it. I think the whole reason not to do this in the past was authorized_keys, but since we use the API to check keys today it seems fine. HTTPS is generally much simpler to configure in all cases, including for CI.
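The Runner-side half of the ephemeral key idea could look roughly like the sketch below: generate a throwaway key pair for the job, then point Git at it through the `GIT_SSH_COMMAND` environment variable. The helper names and the key comment are hypothetical, and the step where the public key is sent to GitLab for authorization is intentionally left out because the proposal does not define it. The sketch assumes `ssh-keygen` is available on the Runner host.

```python
import os
import shlex
import subprocess


def generate_ephemeral_key(directory, comment):
    """Create a passphrase-less ed25519 key pair in `directory` (hypothetical helper).

    Returns (private_key_path, public_key_text). Requires ssh-keygen on PATH.
    """
    key_path = os.path.join(directory, "id_ed25519")
    subprocess.run(
        ["ssh-keygen", "-q", "-t", "ed25519", "-N", "", "-C", comment, "-f", key_path],
        check=True,
    )
    with open(key_path + ".pub") as pub:
        return key_path, pub.read().strip()


def git_ssh_env(key_path, base_env=None):
    """Return an environment that makes Git use only the given private key."""
    env = dict(os.environ if base_env is None else base_env)
    # Git runs GIT_SSH_COMMAND through the shell, so the path is quoted.
    env["GIT_SSH_COMMAND"] = f"ssh -i {shlex.quote(key_path)} -o IdentitiesOnly=yes"
    return env


if __name__ == "__main__":
    import tempfile

    # The temporary directory, and the key with it, is removed when the job ends.
    with tempfile.TemporaryDirectory() as job_dir:
        key_path, public_key = generate_ephemeral_key(job_dir, "ephemeral-job-key")
        # Assumption: the public key would be sent to GitLab here to be authorized.
        env = git_ssh_env(key_path)
        print(env["GIT_SSH_COMMAND"])
```

Keeping the key in a per-job temporary directory means nothing persists on the Runner after the job, which matches the "ephemeral" intent of the proposal.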

Slack Discussion

Further details

SSH vs HTTPS .... enough said :) A few issues:

Permissions and Security

Documentation

Availability & Testing

Available Tier

What does success look like, and how can we measure that?

What is the type of buyer?

Is this a cross-stage feature?

Links / references

@tmaczukin @ayufan
