Clean up runner environment before compiling site
Today, the content of a review app was deployed to the production docs site. The sequence of events that led to this edge case:
- Evan triggered a review app in this MR: gitlab!100905 (merged)
- He manually triggered the review app deployment in this pipeline: https://gitlab.com/gitlab-org/gitlab/-/pipelines/665519152#/
- That triggered a pipeline in gitlab-docs to build the review app itself: https://gitlab.com/gitlab-org/gitlab-docs/-/pipelines/665519697
- The content was compiled in https://gitlab.com/gitlab-org/gitlab-docs/-/jobs/3165982547 (the compile_dev job). This job ran on runner: #11574045 (8cwZ3F43) 4-blue.shared-gitlab-org.runners-manager.gitlab.com
- Later, a scheduled pipeline ran to deploy the docs site to production: https://gitlab.com/gitlab-org/gitlab-docs/-/pipelines/665528808
- This pipeline's compile_prod job: https://gitlab.com/gitlab-org/gitlab-docs/-/jobs/3166030778
- That job ran on the exact same runner: #11574045 (8cwZ3F43) 4-blue.shared-gitlab-org.runners-manager.gitlab.com
- At the top of the job, you can see:
    Checking out cfc150d4 as main...
    Removing .yarn-cache/
    Removing content/_data/feature_flags.yaml
    Removing node_modules/
    Removing public/
    Removing tmp/
    Removing vendor/
    Skipping Git submodules setup
- Later, in the same job:
    $ bundle exec rake default
    INFO: Cloning https://gitlab.com/gitlab-org/gitlab.git..
    fatal: destination path '../gitlab' already exists and is not an empty directory.
- The next scheduled pipeline hit a different runner and behaved as expected: https://gitlab.com/gitlab-org/gitlab-docs/-/jobs/3166251536, restoring the site to match the current default branch in gitlab.
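It turns out that the gitlab-org internal shared runners reuse the VM environment to speed up our pipelines. At the start of each new job, the runner tries to clean up the environment (see the Removing lines above), but the rake task clones repos outside the default working tree (for example, into ../gitlab), and the runner cannot clean up those clones. In this case, the clone in compile_prod failed because ../gitlab was still present from the earlier compile_dev job, so the production build most likely picked up the review app's content from that leftover checkout.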
We need to update the rake task to handle the case where a job lands on a runner that has already run a docs build, so that we never reuse an old version of the docs or a review app's content.
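
One possible shape for that change is sketched below. This is a minimal illustration and not the actual gitlab-docs rake code: the helper names remove_stale_checkout and fresh_clone, the branch: parameter, and the shallow-clone flags are all assumptions made for the example. The idea is simply to delete any leftover directory at the clone target before cloning, and to fail the job if the clone itself fails.

```ruby
require 'fileutils'

# Hypothetical helper (not from the gitlab-docs codebase).
# Deletes a leftover checkout at `target`, if one exists.
# Returns true when something was removed, false otherwise.
def remove_stale_checkout(target)
  return false unless File.exist?(target)

  puts "INFO: Removing stale checkout at #{target}"
  FileUtils.rm_rf(target)
  true
end

# Hypothetical helper (not from the gitlab-docs codebase).
# Clones `repo_url` into `target`, after making sure nothing from a
# previous job on a reused runner is left at that path.
def fresh_clone(repo_url, target, branch:)
  remove_stale_checkout(target)

  # The shallow, single-branch clone is an assumption for this sketch.
  # `system` returns true only when git exits successfully.
  ok = system('git', 'clone', '--depth', '1', '--branch', branch, repo_url, target)

  # Stop the job rather than compiling whatever happens to be on disk.
  raise "Failed to clone #{repo_url} into #{target}" unless ok
end

# Example call (commented out because it needs network access):
# fresh_clone('https://gitlab.com/gitlab-org/gitlab.git', '../gitlab', branch: 'master')
```

Removing the directory unconditionally costs a full re-clone on reused runners, but it avoids having to reason about which branch or commit a leftover checkout is on. Raising on a failed clone also matters here: in the incident above, the job carried on after the fatal message from git.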