Implement repository caching in GitLab pre-clone step
In https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/8407, we see that the gitlab-org/gitlab
repository is causing high CPU load on file-02 due to CI clones. Each commit can launch hundreds of builds.
The runner does pre-cache the git directories if the machine is re-used, but this doesn't work if shared runners are used.
@ayufan mentioned we could enable all runners to execute some predefined env variable via pre_clone_script
(https://docs.gitlab.com/runner/configuration/advanced-configuration.html). For example:
pre_clone_script = "eval \"$CI_PRE_CLONE_SCRIPT\""
pre_clone is injected before the git init.
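A minimal sketch of how the eval hook behaves, run outside the runner. It assumes only standard POSIX shell semantics; the echo payload is a hypothetical stand-in for a real pre-clone script. Projects that do not define the variable see a no-op, while projects that define it get their script executed.

```shell
#!/bin/sh
# Case 1: the project defines no CI_PRE_CLONE_SCRIPT.
# eval of an empty string does nothing and returns 0, so the job proceeds as usual.
unset CI_PRE_CLONE_SCRIPT
eval "${CI_PRE_CLONE_SCRIPT:-}"
status_unset=$?

# Case 2: the project defines the variable (hypothetical payload for illustration).
CI_PRE_CLONE_SCRIPT='echo "restoring cache"'
output=$(eval "$CI_PRE_CLONE_SCRIPT")

echo "status when unset: $status_unset"
echo "output when set: $output"
```

Because the hook is driven by a variable, each project can opt in through its CI/CD settings without any further change to the runner configuration.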
Then we can do something like:
- Run a scheduled pipeline or a build in prepare phase to upload a .bundle (or tar.gz) to object storage. Using a tarball is significantly faster.
- Set CI_PRE_CLONE_SCRIPT to download this bundle if available and extract it to the directory.
This assumes that having even a slightly old copy of the Git repository is better than cloning anew because there are fewer objects for the server to send and compress.
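The assumption can be illustrated locally with a throwaway repository. This is only a sketch: the temporary paths, file names, and commit messages are hypothetical, and a local directory stands in for the Git server. A snapshot is taken, the "server" gains a commit, and a job that restores the stale snapshot only has to fetch the missing commit.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# A stand-in "server" repository with one commit.
git init -q "$work/server"
cd "$work/server"
git config user.email "ci@example.com"
git config user.name "CI"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"

# Snapshot the repository, as the scheduled job would.
tar czf "$work/cache.tar.gz" -C "$work/server" .

# The server moves ahead after the snapshot was taken.
echo two >> file.txt
git commit -q -am "second commit"

# A job restores the stale snapshot, then fetches only what is missing.
mkdir "$work/job"
tar xzf "$work/cache.tar.gz" -C "$work/job"
cd "$work/job"
before=$(git rev-list --count --all)
git fetch -q "$work/server" HEAD
after=$(git rev-list --count FETCH_HEAD)

echo "commits in cache: $before, commits after fetch: $after"

# Clean up the scratch directory.
cd /
rm -rf "$work"
```

The fetch negotiates against the objects already present in the extracted copy, so the older the snapshot, the more the server has to send, but it should never be more than a fresh clone.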
Obviously having this caching inside Gitaly would be preferable, but this would at least be a short-term solution to alleviate file server load on file-02 and to see how effective this might be.
Chef changes
- Staging: https://ops.gitlab.net/gitlab-cookbooks/chef-repo/merge_requests/2310/diffs
- Prod: https://ops.gitlab.net/gitlab-cookbooks/chef-repo/merge_requests/2312/diffs
Pre-clone script
Define a CI/CD variable CI_PRE_CLONE_SCRIPT (can't be defined in repo because we don't have a repo yet!):
echo "Downloading archived master..."
wget -O /tmp/gitlab.tar.gz https://storage.googleapis.com/gitlab-ci-bundle-cache/project-278964/gitlab-master.tar.gz
# wget -O creates the output file even when the download fails,
# so test for a non-empty file (-s) rather than mere existence (-f).
if [ ! -s /tmp/gitlab.tar.gz ]; then
  echo "Repository cache not available, cloning a new directory..."
  exit
fi
rm -rf "$CI_PROJECT_DIR"
echo "Extracting tarball into $CI_PROJECT_DIR..."
mkdir -p "$CI_PROJECT_DIR"
cd "$CI_PROJECT_DIR"
tar xzf /tmp/gitlab.tar.gz
Bundle update script
We'd need to periodically update this bundle via something like:
git clone -b master https://gitlab.com/gitlab-org/gitlab.git /tmp/gitlab
cd /tmp/gitlab
tar cvf /tmp/gitlab-master.tar .
gzip /tmp/gitlab-master.tar
/cc: @rymai, @jramsay