Review pipeline cache policy
We currently cache several directories for our frontend tooling across all pipelines in the GitLab CE/EE repos.
These directories accumulate compiled intermediate output from transpilers like Babel and various webpack loaders. The problem is that these cache directories are never pruned or maintained. When the tools run locally, the caches are generally stored in /tmp/*, where they are wiped out when the system reboots. Since we are caching them in our CI system, they are never wiped out and will continue to grow indefinitely.
I suggest that we implement some sort of cache purge policy that triggers once every few weeks to ensure our caches don't balloon out of control.
@ayufan is there currently any mechanism to wipe out caches on a defined schedule, or would we need to implement something manually? I'm not sure how best to do this: if we used a CI script, the purge would not be an atomic operation, and it could easily be undone by a concurrent pipeline that saves and uploads its own cache afterward, restoring what was just deleted.
Also, are file timestamps preserved within cache archives? Could we use them to determine the age of some caches and define a more idempotent policy around that?
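To make the timestamp idea concrete, here is a minimal sketch of an age-based prune. It assumes modification times survive the cache archive round trip, which is exactly the open question above. The function name `prune_stale_files`, the 21-day threshold, and the example path are hypothetical, not anything that exists in our CI config today.

```python
import os
import time
from pathlib import Path


def prune_stale_files(cache_dir, max_age_days, now=None):
    """Delete files under cache_dir whose mtime is older than max_age_days.

    Returns the list of removed file paths. Running it twice in a row
    removes nothing the second time, so the result does not depend on
    how often the job runs.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    root = Path(cache_dir)
    if not root.is_dir():
        return removed

    # Walk bottom-up so emptied subdirectories can be removed as we go.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.lstat().st_mtime < cutoff:
                    path.unlink()
                    removed.append(str(path))
            except FileNotFoundError:
                # Another process removed the file first; nothing to do.
                pass
        if Path(dirpath) != root:
            try:
                os.rmdir(dirpath)  # only succeeds if the directory is empty
            except OSError:
                pass
    return removed


if __name__ == "__main__":
    # Hypothetical cache location, for illustration only.
    for stale in prune_stale_files("tmp/cache/babel-loader", max_age_days=21):
        print("removed", stale)
```

Something like this could run at the start of a job, before the cache is re-uploaded. Unlike a wholesale purge, a concurrent pipeline uploading its own cache would only bring back files that the next run prunes again. One caveat: mtime records the last write, not the last read, so an entry that is still in use but has not been rewritten would be deleted and then regenerated on the next build.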
Original discussion, carried over from https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/23406#note_122386676
My worry here is that jest may not be cleaning up after itself. The default behavior is to store everything in /tmp/..., which on a normal system would be wiped out when the system reboots. However, perpetuating this directory in the pipeline cache means that will never happen and the cache will grow indefinitely as more and more unique files are transpiled. TBH, this is actually a concern for some of our webpack/babel caches as well, and we probably ought to create some sort of periodic purge of the cache data.