Prevent cache reuploads when cache was hit

Release notes

Add support for a new caching policy, pull-push-fresh (naming TBD). This policy pulls the cache if one is found, but only pushes when the cache was missed (i.e., no cache existed for the key). This is useful if you have a job that fetches and caches dependencies before any of your other jobs run, and you want to fan out the remaining jobs using the cached dependencies. Without it, using the cache can actually be slower, since the first job always has to restore the cache and then push it again, even on a cache hit.

Problem to solve

If time to pull + time to push > time to fetch dependencies (e.g., npm ci), then using the cache is actually slower than installing the dependencies directly. The new policy lets users shorten pipeline times by skipping a potentially unnecessary push, and save on bandwidth to upstream registries.
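
To make the inequality concrete, here is a minimal sketch that compares per-job cache overhead on a cache hit. The function name and all timings are made up for illustration; they are not measurements from any real pipeline, and the model assumes a hit means no dependency fetch is needed.

```python
def seconds_on_cache_hit(pull_s: float, push_s: float, policy: str) -> float:
    """Cache overhead for one job when the cache key already exists.

    Simplified model: a hit means no dependency fetch is needed, so the
    only cost is moving the cache archive around.
    """
    if policy == "pull-push":
        # Existing behaviour: restore, then upload again regardless of the hit.
        return pull_s + push_s
    if policy == "pull-push-fresh":
        # Proposed behaviour (tentative name): restore only, skip the upload.
        return pull_s
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical timings in seconds, chosen only to illustrate the problem.
PULL_S, PUSH_S, FETCH_S = 20.0, 35.0, 45.0

with_pull_push = seconds_on_cache_hit(PULL_S, PUSH_S, "pull-push")
with_fresh = seconds_on_cache_hit(PULL_S, PUSH_S, "pull-push-fresh")

# With these numbers, pull-push is slower than not caching at all,
# while the proposed policy is faster.
print(with_pull_push > FETCH_S)
print(with_fresh < FETCH_S)
```

With these illustrative numbers, the redundant push is what tips the cached job past the cost of a plain fetch.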

Intended users

  • Software Developers
  • Anyone using caches

User experience goal

Users should be able to change their policy to pull-push-fresh and avoid potentially redundant uploads.

Proposal

Add a new cache policy to GitLab Runner.
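
The intended semantics can be sketched as a small decision function. This is only an illustration of the proposed behaviour, not GitLab Runner's actual implementation: the type and function names are hypothetical, and pull-push-fresh is the tentative name from this proposal. The pull-push, pull, and push entries reflect the existing policies.

```python
from enum import Enum
from typing import NamedTuple


class CachePolicy(Enum):
    PULL_PUSH = "pull-push"              # existing: restore, then always upload
    PULL = "pull"                        # existing: restore only
    PUSH = "push"                        # existing: upload only
    PULL_PUSH_FRESH = "pull-push-fresh"  # proposed here; name is tentative


class CacheActions(NamedTuple):
    pull: bool  # attempt to restore the cache at job start
    push: bool  # upload the cache at job end


def cache_actions(policy: CachePolicy, cache_hit: bool) -> CacheActions:
    """Decide which cache steps a job runs.

    cache_hit says whether an archive already existed for the cache key;
    it only affects the proposed pull-push-fresh policy.
    """
    if policy is CachePolicy.PULL_PUSH:
        return CacheActions(pull=True, push=True)
    if policy is CachePolicy.PULL:
        return CacheActions(pull=True, push=False)
    if policy is CachePolicy.PUSH:
        return CacheActions(pull=False, push=True)
    # Proposed behaviour: upload only when the restore missed.
    return CacheActions(pull=True, push=not cache_hit)
```

Under this sketch, the first job to see a given key uploads the cache, and every later job with the same key skips the upload.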

Available Tier

Free

What is the competitive advantage or differentiation for this feature?

Other CI platforms (e.g., GitHub, CircleCI) provide similar ergonomics by splitting cache saves and cache loads across different jobs/actions. Adopting this feature would align GitLab with its competitors' offerings.

Errata

I'm happy to implement this feature, but I couldn't find whether it had been suggested previously or whether there was a reason it isn't already supported.
