GitLab Runners use a cache to speed up execution by reusing existing data. Sometimes this leads to inconsistent behavior, and currently there is no way to force a job to start with a fresh copy of the cache.
We should implement a way to do that, so weird cases are easily solved using the UI.
Proposal
Add a button in CI/CD > Pipelines to clean up the cache. This works by incrementing a counter in the database and using the value of that counter to build the cache key. After the button is pressed, a new key is generated and the old cache is no longer valid; eventually the garbage collector will remove it.
I have an issue with docker builds, and I am not sure whether it is cache related. One build fails, leaving some locally modified files in the git repo. All subsequent builds then fail with the same error (see below). Is there any way to clear those local changes in the build container without cloning the repo from scratch (something like git reset --hard && git clean -df)? Our repo is ~1GB and it takes many failed builds until the whole thing is successfully cloned (previously reported in gitlab-runner#1181 (closed)).
gitlab-ci-multi-runner 1.1.0 (a23a25a)
Using Docker executor with image node:5.4 ...
Pulling docker image node:5.4 ...
Running on runner-c31f537c-project-96-concurrent-0 via 0ee933251bb0...
Fetching changes...
HEAD is now at 99ca909 Merge branch 'master' into Example.com
From https://git.colddata.com/sites/webcms2-core-mirror-colddata
   0108d42..ad5138c  Example.com -> origin/Example.com
Checking out ad5138c4 as Epogen.com...
error: Your local changes to the following files would be overwritten by checkout:
    dotnet/core/website/Mvc/ExampleCom/Layouts/Containers/CalloutsContainer/CalloutsContainer.cshtml
    dotnet/core/website/Mvc/ExampleCom/Layouts/Containers/FooterContainer/FooterContainer.cshtml
    dotnet/core/website/Mvc/ExampleCom/Layouts/Containers/HeaderContainer/HeaderContainer.cshtml
    dotnet/core/website/Mvc/ExampleCom/Layouts/Containers/IsiContainer/IsiContainer.cshtml
    dotnet/core/website/Mvc/ExampleCom/Layouts/Containers/MainContentContainer/MainContentContainer.cshtml
Please, commit your changes or stash them before you can switch branches.
Aborting
ERROR: Build failed: exit code 1
@ayufan This now happens once a day and is affecting our builds. Is there a way to have the runner discard all local changes before trying to check out? My only workaround so far is to remove the build container and then have the runner clone a fresh repo.
Thank you. FYI, so far I believe this happened with the files that had different EOLs and perhaps file mode. This is a Windows project that is stored in the repo with LF line-endings. Those files cannot be reset even with git reset --hard && git clean -df.
I am now setting this in the .gitlab-ci.yml build run to see if it helps:
@demisxbar were you able to fix this? I'm having exactly the same problem using shared gitlab.com runners, and I cannot work around it in any way. All builds fail due to:
error: Your local changes to the following files would be overwritten by checkout
And the local change is caused by EOL conversion on checkout.
This seems to be a bug on the runner. I tried removing the builds, but same error.
Also tried creating a new branch, but the same happens.
before_script doesn't work because it runs after the clone operation.
I also tried changing the option in settings from git fetch to git clone, but the result is the same.
@gzunino As far as I remember, the way I fixed this was to perform the build steps locally, resolve the issues locally, and then push to origin. The runner would fail each time there were changes to EOLs or file modes, and even git reset --hard couldn't get rid of those. I believe I had to run rm .git/index; git reset && git add --all; git commit -m '...'. I also believe I had to remove the docker build container and have the runner start from scratch. Sorry, I don't recall the exact details.
Thanks for the info. Unfortunately didn't work for me. Still get the same error. I don't know how to make the runner start from scratch because it's a shared runner from gitlab.com.
@gzunino Have you tried removing the docker build container(s) that runner creates? It will then rebuild it on the next run. The issue with EOL needs to be resolved locally manually in git first, though, so you don't run into it again.
As a workaround, it should be possible to manually "expire" a cache by changing its cache:key. While this requires committing a change to .gitlab-ci.yml, it also lets you keep track of cache expirations and possibly reuse old caches when applicable.
Do note that expiring/excluding caches from the GitLab interface is still a feature I'd like to see implemented. A cache:expire_in key would also be useful.
I'm seeing an issue where the container instances for gitlab-runner-cache are not being removed, meaning over time the storage space on the server gets eaten up. I've found the script mentioned by @XenoPhage to resolve this for me.
e.g. output of docker ps
004899d9e74c  79356b9f1081  "gitlab-runner-cache "  About an hour ago  Exited (0) About an hour ago  runner-f0002f4f-project-864-concurrent-2-cache-fe81
c1caa0290d16  79356b9f1081  "gitlab-runner-cache "  About an hour ago  Exited (0) About an hour ago  runner-f0002f4f-project-132-concurrent-0-cache-5f29
9e3321eb6a59  79356b9f1081  "gitlab-runner-cache "  About an hour ago  Exited (0) About an hour ago  runner-f0002f4f-project-864-concurrent-1-cache-fe81
64e67df04f6e  79356b9f1081  "gitlab-runner-cache "  About an hour ago  Exited (0) About an hour ago  runner-f0002f4f-project-864-concurrent-0-cache-fe81
I have also found a large number of volumes listed by docker volume ls. These volumes appear to be linked to the cache containers (when I tried to remove one, it stated it was in use by a gitlab-runner-cache container).
This may be related: the volumes aren't removed because the gitlab-runner-cache containers aren't removed.
I'm running gitlab-ci-multi-runner 1.8.0 package install on ubuntu server 16.04
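The cleanup script referred to above is not shown in this thread, so the following is only a rough sketch of how the leftover containers could be identified. It assumes the listing was produced with docker ps -a --format '{{.Status}}\t{{.Names}}' and that cache containers follow the runner-...-cache-... naming seen in the output above; the function name is made up for illustration.

```python
from typing import List


def exited_cache_containers(ps_output: str) -> List[str]:
    """Return names of exited runner cache containers.

    Expects one container per line in the form "<status>\t<name>",
    i.e. the output of: docker ps -a --format '{{.Status}}\t{{.Names}}'
    """
    names = []
    for line in ps_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        status, name = parts
        # Assumption: cache containers are named runner-<id>-project-<n>-concurrent-<n>-cache-<hash>
        if status.startswith("Exited") and name.startswith("runner-") and "-cache-" in name:
            names.append(name)
    return names


sample = (
    "Exited (0) About an hour ago\trunner-f0002f4f-project-864-concurrent-2-cache-fe81\n"
    "Up 2 hours\trunner-f0002f4f-project-132-concurrent-0-build\n"
    "Exited (0) About an hour ago\trunner-f0002f4f-project-132-concurrent-0-cache-5f29\n"
)
stale = exited_cache_containers(sample)
```

The resulting names could then be passed to docker rm -v, which also removes anonymous volumes attached to the container. Note that removing these containers discards the local cache they hold.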
@ayufan I suggest two ways of clearing the cache. The first is a keyword in the commit message, e.g. [skip cache], for when the developer is aware of possible failures due to the cache, or for scenarios where an MR is accepted and the integration policy prefers rerunning all the steps for stable branches. The second is a feature in the GitLab interface, like Retry without cache, for when the problem occurs.
I upgraded a few projects from node7/npm4 to node8/npm5 and the builds failed because of a cached node_modules directory. Could only solve this by adding a new job to .gitlab-ci.yml:
The result is two extra commits per project, which could be avoided if there was a tick saying Reset cache on the New Pipeline page. A more advanced option would be a parsable trigger in the commit message, similar to how this is done for [ci skip]. This could be called like [ci reset-cache].
Sadly, the hack I applied does not work when each branch, stage or job has its own cache key. A tick in the GitLab UI or a working [ci reset-cache] label in a commit message would be much more useful here.
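To make the commit-message idea concrete, here is a minimal sketch of how such a trigger might be detected. [ci reset-cache] is only the label proposed in this thread, not an existing GitLab feature, and the function name is hypothetical.

```python
import re

# Hypothetical label proposed in this thread; matched case-insensitively.
RESET_CACHE_PATTERN = re.compile(r"\[ci reset-cache\]", re.IGNORECASE)


def wants_cache_reset(commit_message: str) -> bool:
    """Return True if the commit message contains the [ci reset-cache] label."""
    return RESET_CACHE_PATTERN.search(commit_message) is not None


print(wants_cache_reset("Upgrade to node8/npm5 [ci reset-cache]"))
print(wants_cache_reset("Fix typo in README"))
```

Because the label would apply to the whole pipeline rather than to one cache key, it would not depend on how each branch, stage or job names its cache.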
@ayufan I'm fine with that, but I'm not sure about the UX impact. Having the option when running a pipeline manually is feasible, but it's not the same as just resetting the cache for the next run.
Also, who should be able to do that? Only Masters or anyone that can run a pipeline?
Maybe a button in the project settings could be an option too.
I was thinking about sending a number to the runner to indicate which cache version to use, thus allowing you to clear the cache on a per-project basis. Instead of removing anything, we assume that the cache is garbage collected, and we send an indicator (an incrementing number) that we want to use a new cache. On the runner, we use this number to construct the cache:key. Technically we do not clear the cache, but we force all subsequent jobs to use a new, freshly built cache, because the cache:key no longer matches.
On the frontend we could then have a simple button, Force cache refresh, which would simply increment the cache_version in the database that is sent to the runner.
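The idea can be sketched as follows. This is only an illustration under stated assumptions: cache_version is the name used above, while the Project class, force_cache_refresh, versioned_cache_key and the "<key>-<version>" format are made up here and do not reflect how GitLab or the runner actually implement it.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """Stand-in for the project record holding the counter (hypothetical)."""
    cache_version: int = 0

    def force_cache_refresh(self) -> None:
        # What the "Force cache refresh" button would do: bump the counter.
        self.cache_version += 1


def versioned_cache_key(user_key: str, cache_version: int) -> str:
    """Build the effective cache key on the runner side.

    Assumption: the version is simply appended to the user's cache:key.
    """
    return f"{user_key}-{cache_version}"


project = Project()
before = versioned_cache_key("node-modules", project.cache_version)
project.force_cache_refresh()
after = versioned_cache_key("node-modules", project.cache_version)
print(before, after)
```

Nothing is deleted in this sketch: the old archive stored under the previous key simply stops being looked up, which is why the approach relies on garbage collection to reclaim the space.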