Resolve "Implement repository caching in www-gitlab-com pre-clone step"

NOTE: This change was reverted because it was not performant enough. We will revisit the Docker approach. See the details in this comment: !40665 (comment 287868133)

DESCRIPTION

For each job in the www-gitlab-com CI/CD build, cloning the git repo currently takes between a minute and a half and two minutes, because the repo is very large and the transfer takes a lot of network time.

If we regularly cache a git clone of master as a tarball in object storage, and fetch only the commits added to the current pipeline branch since the last tarball was built, this time could be greatly reduced.

RELATED ISSUES

This approach has already been successfully implemented for the gitlab-org/gitlab repo, so all we should need to do is replicate it for this gitlab-com/www-gitlab-com repo.

We had previously planned to accomplish this by building a Docker image, but it makes more sense to use the already-proven approach, which is also simpler than the Docker approach.

TASKS

  • Add the schedule to the repo
  • Create the bucket (or reuse an existing one)
  • Add a CI variable with the credential for the bucket
  • Set up the CI_PRE_CLONE_SCRIPT variable - see documentation here and required contents below (note the documentation is not yet merged or available on the live docs site)
  • Add the sync stage and job
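
The sync job in the last task has to produce a tarball that the pre-clone script can extract from inside $CI_PROJECT_DIR. The following is a minimal sketch of that step, not the job actually used: the function name build_repo_tarball is hypothetical, and the commented-out upload command assumes gsutil is available and that the bucket path mirrors the download URL used in the pre-clone script.

```shell
#!/bin/sh
# Hypothetical helper: archive a checked-out repository (including .git)
# so that extracting the tarball from inside $CI_PROJECT_DIR restores it in place.
build_repo_tarball() {
    src_dir=$1      # path to the checked-out repository
    out_file=$2     # absolute path of the .tar.gz to create (outside src_dir)

    [ -d "$src_dir" ] || { echo "No such directory: $src_dir" >&2; return 1; }

    # -C switches into the repository first, so archive members are stored
    # relative to the repository root ("./.git", "./README.md", ...).
    tar -C "$src_dir" -czf "$out_file" .
}

# Possible use in the scheduled job (the upload step is an assumption):
# build_repo_tarball "$CI_PROJECT_DIR" /tmp/www-gitlab-com-master.tar.gz
# gsutil cp /tmp/www-gitlab-com-master.tar.gz \
#   gs://gitlab-ci-git-repo-cache/project-278964/www-gitlab-com-master.tar.gz
```

Storing members relative to the repository root matters because the pre-clone script changes into $CI_PROJECT_DIR before running tar; an archive that contained the project directory itself would end up nested one level too deep.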

CI_PRE_CLONE_SCRIPT variable contents:

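# Download the cached tarball of master from object storage.
echo "Downloading archived master..."
wget -O /tmp/www-gitlab-com-master.tar.gz https://storage.googleapis.com/gitlab-ci-git-repo-cache/project-278964/www-gitlab-com-master.tar.gz

# wget -O creates the output file even when the download fails,
# so test for a non-empty file rather than for mere existence.
if [ ! -s /tmp/www-gitlab-com-master.tar.gz ]; then
    echo "Repository cache not available, cloning a new directory..."
    exit 0
fi

# Replace the project directory with the contents of the cached clone.
rm -rf "$CI_PROJECT_DIR"
echo "Extracting tarball into $CI_PROJECT_DIR..."
mkdir -p "$CI_PROJECT_DIR"
cd "$CI_PROJECT_DIR"
tar xzf /tmp/www-gitlab-com-master.tar.gz
rm -f /tmp/www-gitlab-com-master.tar.gz
chmod a+w "$CI_PROJECT_DIR"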
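
The script above removes $CI_PROJECT_DIR before extracting, so a truncated or corrupt download could leave a partial checkout behind. One possible hardening, offered only as a sketch, is to validate the archive before deleting anything. The function name restore_repo_cache is hypothetical and is not part of the variable contents above.

```shell
#!/bin/sh
# Hypothetical variant: validate the tarball before touching the project dir.
# Returns non-zero (and leaves the project dir untouched) if the cache is unusable.
restore_repo_cache() {
    tarball=$1
    project_dir=$2

    # Refuse to run with an empty target path.
    [ -n "$project_dir" ] || return 1

    # Missing or empty file: nothing to restore.
    [ -s "$tarball" ] || return 1

    # gzip -t checks the integrity of the compressed data without extracting it.
    gzip -t "$tarball" 2>/dev/null || return 1

    # Only now replace the project directory with the cached clone.
    rm -rf "$project_dir"
    mkdir -p "$project_dir"
    tar -xzf "$tarball" -C "$project_dir" || return 1
    chmod a+w "$project_dir"
}
```

A caller could fall back to a normal clone when the function fails, for example: restore_repo_cache /tmp/www-gitlab-com-master.tar.gz "$CI_PROJECT_DIR" || echo "Repository cache not available, cloning a new directory..."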

Closes #6511 (closed)

Edited by Chad Woolley
