RPC to stream a possibly-inconsistent snapshot of a git repository
Per https://gitlab.com/gitlab-org/gitlab-ce/issues/39345
For initial replication in Geo, we want to pull a snapshot of a git repository in a non-CPU-intensive way.
git clone always gives a consistent, up-to-date repository, but it can be CPU-intensive to compute.
At a minimum, this endpoint will stream a tar file of a bare repository, containing a skeleton config file, refs/, and the packfiles database. We may be able to get away with more or fewer items depending on experimentation.
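
To make the shape of the snapshot concrete, here is a minimal Python sketch of building such a tar stream. It is an illustration only, not Gitaly's implementation. The function name `write_snapshot`, the skeleton config contents, and the default entry list are assumptions; in particular, `HEAD` and `packed-refs` go beyond the items listed above and are included only because git needs `HEAD` to recognise a directory as a repository.

```python
import io
import tarfile
from pathlib import Path

# Hypothetical default selection. "refs" and "objects/pack" come from the
# description above; "HEAD" and "packed-refs" are assumptions added here.
SNAPSHOT_ENTRIES = ("HEAD", "packed-refs", "refs", "objects/pack")

# Assumed minimal config for a bare repository (not the repository's real config).
SKELETON_CONFIG = b"[core]\n\trepositoryformatversion = 0\n\tbare = true\n"


def write_snapshot(repo_dir, out, entries=SNAPSHOT_ENTRIES):
    """Write a tar of selected parts of the bare repository at repo_dir to out.

    Files are copied as they are on disk, with no locking, so the result may
    be inconsistent if the repository is written to while the tar is built.
    """
    repo = Path(repo_dir)
    # Mode "w|" produces a sequential stream, so `out` does not need to be seekable.
    with tarfile.open(fileobj=out, mode="w|") as tar:
        info = tarfile.TarInfo("config")
        info.size = len(SKELETON_CONFIG)
        tar.addfile(info, io.BytesIO(SKELETON_CONFIG))
        for entry in entries:
            path = repo / entry
            if path.exists():  # skip entries this repository does not have
                tar.add(str(path), arcname=entry)
```

Because the archive is written in streaming mode, `out` could be a socket or pipe wrapper as well as a regular file. No git process is involved, which is where the CPU saving over git clone would come from.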
The envisioned sequence is:
- Download and extract a tar file from gitaly
- Run git fetch ...
- Run git fsck, encountering no errors
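
The sequence above could be driven by something like the following Python sketch. The name `restore_from_snapshot` and its parameters are hypothetical. The arguments to git fetch are left to the caller because they are not specified here, and the snapshot is assumed to come from a trusted Gitaly server.

```python
import subprocess
import tarfile


def restore_from_snapshot(snapshot_path, repo_dir, fetch_args, run=subprocess.run):
    """Extract a snapshot tar into repo_dir, then bring it up to date and verify it.

    fetch_args is the caller-supplied argument list for `git fetch`.
    `run` is injectable so the sequence can be exercised without git installed.
    """
    # Step 1: extract the tar file downloaded from gitaly.
    # The archive is assumed to be trusted; untrusted input would need path checks.
    with tarfile.open(snapshot_path) as tar:
        tar.extractall(repo_dir)

    git = ["git", f"--git-dir={repo_dir}"]

    # Step 2: fetch whatever the possibly-inconsistent snapshot is missing.
    run(git + ["fetch", *fetch_args], check=True)

    # Step 3: verify the result. git fsck exits non-zero when it finds errors,
    # so check=True turns any error into a CalledProcessError.
    run(git + ["fsck"], check=True)
```

With `check=True`, a failure at either git step aborts the sequence, which matches the requirement that git fsck encounters no errors.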
This sequence needs to be significantly faster than a plain git clone ... for this to be worthwhile.
Initial testing in https://gitlab.com/gitlab-org/gitlab-ce/issues/39345 suggests that will be the case.