[Gitlab runner] How optimised is the pipeline git clone strategy?
Problem to solve
The docs for the pipeline `clone` git strategy (https://docs.gitlab.com/ee/ci/yaml/README.html#git-strategy) say that `clone` is the slowest of the options. I want to find out whether there are any optimisations going on under the hood, or if it's just a plain `git clone`.
On our Jenkins checkout we use a couple of strategies to get the equivalent of a clean checkout, but with a significant speed boost.
Further details
What we do:
- We keep a long-lived bare repository locally on each CI machine.
- At the start of each job, we run a fetch on this bare repo to bring it up to date.
- We do a `--reference-if-able` clone into the working directory for the CI job, specifying the bare repo as the repository to reference. (We skip the LFS download at this stage.)
- We do something similar with Git LFS, using the `lfs.storage` config to keep a long-lived directory of our LFS blobs around, and pointing the working directory's git repo at it for the LFS pull.
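The steps above can be sketched as a shell script. Everything here is hypothetical for illustration: a throwaway "origin" repo stands in for the remote, `cache.git` plays the role of the long-lived bare repository, and the Git LFS step is left out so the sketch runs with plain git:

```shell
set -e
TMP=$(mktemp -d)

# Hypothetical layout: "origin.git" stands in for the remote,
# "cache.git" is the long-lived bare repo kept on the CI machine,
# and "work" is the fresh working directory for one CI job.
ORIGIN="$TMP/origin.git"; CACHE="$TMP/cache.git"; WORK="$TMP/work"

# Stand-in remote with a single commit.
git init -q -b main --bare "$ORIGIN"
git init -q -b main "$TMP/seed"
( cd "$TMP/seed" \
  && echo hello > file.txt \
  && git add file.txt \
  && git -c user.email=ci@example.com -c user.name=ci commit -qm init \
  && git push -q "$ORIGIN" main )

# Long-lived bare mirror, created once per CI machine.
git clone -q --mirror "$ORIGIN" "$CACHE"

# Step 1: at the start of each job, bring the cache up to date.
git -C "$CACHE" fetch -q --prune origin

# Step 2: clone into the job workspace, borrowing objects from the
# cache via --reference-if-able instead of downloading them again.
git clone -q --reference-if-able "$CACHE" "$ORIGIN" "$WORK"

# The alternates file shows the workspace shares objects with the cache.
cat "$WORK/.git/objects/info/alternates"
```

Because `--reference-if-able` only borrows objects, the workspace itself stays a completely clean checkout; the speed-up comes from the fetch into the cache transferring only new objects.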
Proposal
If optimisations like this aren't already happening, it would be great to add them as an option for `git-strategy` (I can detail them more here, as I did a fair bit of experimentation to find the best setup). Maybe something like `efficient-clone`?
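If this were adopted, one way it might surface is through the existing `GIT_STRATEGY` CI variable. The `efficient-clone` value below is purely the hypothetical name proposed above, not an existing option:

```yaml
# .gitlab-ci.yml sketch -- "efficient-clone" is the proposed strategy;
# today the valid values are clone, fetch, and none.
variables:
  GIT_STRATEGY: efficient-clone
```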
What does success look like, and how can we measure that?
A git strategy that is:
- Faster than `clone` ⏩
- Cleaner than `fetch` ✨