Make pre-clone caching a built-in GitLab feature

In #39134 (closed), we added a pre-clone step to the runner that downloads a cached copy of the repository archive before the job attempts to run. This has dramatically reduced response_bytes and significantly reduced CPU load on the Gitaly servers:

image

Three steps were needed:

  1. Add a pre_clone_script to the runner
  2. Set a CI/CD variable that defines the CI_PRE_CLONE_SCRIPT
  3. Add an automated CI task that uploads the archive (!21646 (merged))
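
For illustration, the logic the pre-clone step performs on the runner side could be sketched roughly as below. This is not the script from #39134; the function name `restore_pre_clone_cache` and the archive layout (a tarball containing the repository's `.git` directory) are assumptions made for the example.

```python
import tarfile
from pathlib import Path


def restore_pre_clone_cache(archive: Path, build_dir: Path) -> bool:
    """Unpack a cached repository archive into build_dir if no clone exists yet.

    Returns True when the archive was unpacked, False when it was skipped.
    Hypothetical helper: the name and archive layout are assumptions.
    """
    if (build_dir / ".git").exists():
        # A previous job already left a clone here, so a normal fetch is enough.
        return False
    if not archive.is_file():
        # No cached archive available: the job falls back to a full clone.
        return False
    build_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:
        # The archive is assumed to be trusted (produced by the project's own CI task).
        tar.extractall(build_dir)
    return True
```

After the archive is unpacked, the regular fetch only has to transfer the commits added since the archive was built, which is where the reduced load on Gitaly would come from.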

Setting this up is a bit difficult, and we should make pre-clone caching a first-class citizen.

Proposal:

  1. Just like we do with the CI cache, add configuration in the Runner to specify the location of a pre-clone cache.
  2. Add some mechanism within GitLab to upload this cache periodically (e.g. automated CI step or part of runner?)
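
As a rough sketch of what the two proposal items could look like together: a small configuration object for the cache location plus a check for when the archive is due for re-upload. Every name here (`PreCloneCacheConfig`, `archive_url`, `refresh_interval`, `needs_refresh`) is hypothetical, and the 24-hour default is only a placeholder; none of this exists in GitLab or the Runner.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class PreCloneCacheConfig:
    # Hypothetical settings; these names are assumptions, not Runner options.
    archive_url: str                                   # where the archive is stored
    refresh_interval: timedelta = timedelta(hours=24)  # placeholder default


def needs_refresh(
    config: PreCloneCacheConfig,
    last_upload: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True when a fresh archive should be uploaded."""
    if last_upload is None:
        # Nothing has been uploaded yet, so the first upload is always due.
        return True
    return now - last_upload >= config.refresh_interval
```

Whether this check would run in a scheduled CI job or inside the runner itself is exactly the open question in item 2; the sketch works the same way in either place.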

@ayufan @tmaczukin What are your thoughts on how we might implement this?

/cc: @jlenny, @DarrenEastman

Edited Dec 12, 2019 by Stan Hu