Cache cloned repo via Docker
NOTE: See the epic for more context on the effort to reduce repo cloning time.
UPDATE 1: This issue has been closed and superseded by a new issue that performs the caching via an object store rather than Docker.
UPDATE 2: The object-store-based caching was not performant enough and had its own complexities, so we are re-opening and revisiting this approach.
UPDATE 3: Since the Docker-based approach has many complexities, and may still be slow due to the need to download a large image, we are instead investigating caching the cloned repo directly on the runners.
DESCRIPTION
For each job in the www-gitlab-com CI/CD build, the git repo clone currently takes between one and a half and two minutes, because the repository is very large and transferring it consumes significant network time.
If we cache the git clone in a Docker image that is rebuilt on a regular schedule (daily or hourly), so that each job only pulls the commits added since the last image was built, this time could be greatly reduced.
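The flow above can be sketched with local repos standing in for the real clone and image. This is a minimal illustration, not the project's actual setup: the directory names and commit messages are invented, and the "baked clone" simulates what would live inside the scheduled Docker image.

```shell
#!/bin/sh
# Simulate: bake a full clone on a schedule, then have a CI job fetch only the delta.
set -e
tmp=$(mktemp -d)

# Stand-in for the real www-gitlab-com repository.
git init -q "$tmp/origin"
git -C "$tmp/origin" -c user.email=ci@example.com -c user.name=ci \
  commit -q --allow-empty -m "initial"

# "Image bake" (scheduled, daily/hourly): the expensive full clone, paid once.
git clone -q "$tmp/origin" "$tmp/baked-clone"

# New commits land on origin after the image was built.
git -C "$tmp/origin" -c user.email=ci@example.com -c user.name=ci \
  commit -q --allow-empty -m "new work"

# "CI job": start from the baked clone and fetch only the commits added since.
branch=$(git -C "$tmp/origin" symbolic-ref --short HEAD)
git -C "$tmp/baked-clone" fetch -q origin
latest=$(git -C "$tmp/baked-clone" log -1 --format=%s "origin/$branch")
echo "$latest"   # the post-bake commit is now available without a full clone
```

The key point is that `git fetch` transfers only objects the baked clone does not already have, so the per-job network cost scales with the commits since the last image build rather than with the full repository size.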
This approach also opens up opportunities to speed up the jobs by doing the caching of RubyGems and NPM/Yarn dependencies asynchronously in the docker image rather than via normal (time-consuming) in-job CI caching support.
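As a rough illustration of that idea, the scheduled image build could bake the dependencies in alongside the clone. The Dockerfile below is hypothetical: the base image, paths, and install flags are assumptions, not the project's actual configuration (it is written out via a shell heredoc purely for illustration).

```shell
# Write the hypothetical Dockerfile to a temp dir for illustration.
dir=$(mktemp -d)
cat > "$dir/Dockerfile" <<'EOF'
# Rebuilt on a schedule (daily or hourly) so jobs start with warm caches.
FROM ruby:2.6
RUN git clone https://gitlab.com/gitlab-com/www-gitlab-com.git /repo
WORKDIR /repo
# Bake the RubyGems and Yarn dependencies into the image, replacing the
# per-job CI cache download/upload with a one-time cost at image build.
RUN bundle install --jobs 4
RUN yarn install --frozen-lockfile
EOF
```

Jobs running on this image would then skip both the clone and the dependency installation, needing only an incremental `git fetch` and, at most, an incremental `bundle install`/`yarn install` for lockfile changes since the image was built.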
ALTERNATIVES
- Can we possibly cache the repo directly on the runners? Asked in Slack.
RELATED ISSUES
- The slow repo clone also appears to be causing merge trains to back up.
For more details, see the associated merge request