Backend: short-circuit CI pipelines based on whether a cache exists
### Problem to solve
"As a GitLab CI/CD pipeline developer, I want to skip jobs (that generate a cache) if this cache already exists (and is up-to-date), so I can get faster pipeline feedback."
Reduce pipeline build time by skipping build jobs that (a) fetch dependencies based on a rarely changing dependency definition file and (b) cache the fetched dependencies for subsequent builds. For example, for Node.js/npm dependencies I have found this to reduce build time by up to ~25%, or almost 2 min, when the build takes ~5+ min for a small project, compared to https://docs.gitlab.com/ee/ci/caching/index.html#caching-nodejs-dependencies:
```
stages:
  - setup
  - test

cache: &global_cache
  key:
    files:
      - package-lock.json
    prefix: "${CI_PROJECT_PATH_SLUG}-${CI_COMMIT_REF_SLUG}-"
  paths:
    - node_modules/
    - .npm/

prepare:
  stage: setup
  cache:
    <<: *global_cache  # inherit all global cache settings
  rules:
    - changes:
        - package-lock.json
    # TODO: What I want is "OR if there is no cache (with the given key) available", not sure what the syntax should look like
    - !cache_exists
  script:
    - npm ci --cache .npm --prefer-offline --no-audit --no-optional

build_test:
  stage: test
  cache:
    <<: *global_cache  # inherit all global cache settings
    policy: pull
  script:
    - npm run lint
    - npm run test:ci
    - npm run build:ci
    - npm run e2e:ci
```
That is, the "prepare" job (which fetches the Node.js/npm dependencies and caches them) must be executed when either of the following holds:
* the definition of dependencies has changed (due to changes in "package-lock.json")
* there is no cache with the dependencies yet (or no longer, e.g. after manually clearing/deleting the caches)

As mentioned in http://disq.us/p/22vac4a, the second condition cannot currently be specified:
> Just a note. The only:changes condition will only work if this is the first push to a new branch, or if you have an existing cache for the branch. Just because your package-lock.json hasn't changed, doesn't mean node_modules has been cached previously.
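
Until such a condition exists, one partial workaround with existing syntax is to always run "prepare" but let its script return early when the restored cache already contains `node_modules/`. This is only a sketch, not something the quoted comment or the GitLab docs prescribe. It assumes that a cache hit leaves `node_modules/` in the working directory and that the `key:files` setting keeps that directory in sync with `package-lock.json`. It shortens the job rather than skipping it, so job start-up and cache transfer time remain:

```
prepare:
  stage: setup
  cache:
    <<: *global_cache  # inherit all global cache settings
  script:
    # Assumption: a restored cache leaves node_modules/ in the working directory.
    - |
      if [ -d node_modules ]; then
        echo "node_modules/ restored from cache, skipping npm ci"
      else
        npm ci --cache .npm --prefer-offline --no-audit --no-optional
      fi
```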
### Intended users
The person adding this flag will be the GitLab CI/CD pipeline developer, but all of the following will benefit in some way from faster pipelines:
* [Delaney (Development Team Lead)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#delaney-development-team-lead)
* [Sasha (Software Developer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sasha-software-developer)
* [Devon (DevOps Engineer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#devon-devops-engineer)
* [Sidney (Systems Administrator)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sidney-systems-administrator)
* [Rachel (Release Manager)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#rachel-release-manager)
* [Simone (Software Engineer in Test)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#simone-software-engineer-in-test)
### User experience goal
(See above)
### Proposal
In addition to `if`, `changes` and `exists`, add something like `cache_exists`. Negation (cf. https://gitlab.com/gitlab-org/gitlab/-/issues/34859) is even more important here. From the complete example above:
```
cache:
  <<: *global_cache  # inherit all global cache settings
rules:
  - changes:
      - package-lock.json
  # TODO: What I want is "OR if there is no cache (with the given key) available", not sure what the syntax should look like
  - !cache_exists
```
I am not sure whether the cache key should (optionally) be re-specified in case it differs from the job's own cache key, although I am not sure there is a use case for that, or whether it makes sense at all.
Alternative syntax for negation:
```
rules:
  ...
  - cache_exists: false
```
... because that might be more consistent with the current syntax (although not fully, because `exists` actually takes a set of filenames) and would also allow the following:
```
rules:
  ...
  - cache_exists
```
I do not think extending the existing `exists` to cover the cache really works: how would `exists` for files and for caches be combined?
```
rules:
  ...
  - exists: CACHE
```
### Links / references
https://gitlab.com/gitlab-org/gitlab/-/issues/16905
https://gitlab.com/gitlab-org/gitlab-foss/-/issues/19232
https://gitlab.com/groups/gitlab-org/-/epics/2783
https://docs.gitlab.com/ee/ci/caching/index.html#caching-nodejs-dependencies
https://www.addthis.com/blog/2019/05/06/how-to-speed-up-your-gitlab-ci-pipelines-for-node-apps-by-40/#.XvWt0CgzaUk