Backend: short-circuit CI pipelines based on whether a cache exists
### Problem to solve

"As a GitLab CI/CD pipeline developer, I want to skip jobs (that generate a cache) if this cache already exists (and is up-to-date), so I can get faster pipeline feedback."

Reduce pipeline build time by skipping build jobs that (a) fetch dependencies based on a rarely changing dependency definition file and (b) cache the fetched dependencies for subsequent builds.

E.g. for Node.js/npm dependencies, I have found that this reduces build time by up to ~25%, or almost 2 min when the build duration is ~5+ min for a small project, compared with https://docs.gitlab.com/ee/ci/caching/index.html#caching-nodejs-dependencies:

```
stages:
  - setup
  - test

cache: &global_cache
  key:
    files:
      - package-lock.json
    prefix: "${CI_PROJECT_PATH_SLUG}-${CI_COMMIT_REF_SLUG}-"
  paths:
    - node_modules/
    - .npm/

prepare:
  stage: setup
  cache:
    <<: *global_cache # inherit all global cache settings
  rules:
    - changes:
        - package-lock.json
    # TODO: What I want is "OR if there is no cache (with the given key) available", not sure what the syntax should look like
    - !cache_exists
  script:
    - npm ci --cache .npm --prefer-offline --no-audit --no-optional

build_test:
  stage: test
  cache:
    <<: *global_cache # inherit all global cache settings
    policy: pull
  script:
    - npm run lint
    - npm run test:ci
    - npm run build:ci
    - npm run e2e:ci
```

That is, the "prepare" job (which fetches the Node.js/npm dependencies and caches them) must be executed when:

* either the definition of dependencies has changed (due to changes in "package-lock.json")
* or there is no cache with the dependencies yet (or anymore, e.g. due to manually clearing/deleting the caches)

As mentioned in http://disq.us/p/22vac4a, the second condition cannot be specified at present:

> Just a note. The only:changes condition will only work if this is the first push to a new branch, or if you have an existing cache for the branch. Just because your package-lock.json hasn't changed, doesn't mean node_modules has been cached previously.

### Intended users

The one adding this flag will be the GitLab CI/CD pipeline developer, but all of the following will benefit in some way from faster pipelines:

* [Delaney (Development Team Lead)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#delaney-development-team-lead)
* [Sasha (Software Developer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sasha-software-developer)
* [Devon (DevOps Engineer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#devon-devops-engineer)
* [Sidney (Systems Administrator)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sidney-systems-administrator)
* [Rachel (Release Manager)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#rachel-release-manager)
* [Simone (Software Engineer in Test)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#simone-software-engineer-in-test)

### User experience goal

(See above)

### Proposal

In addition to `if`, `changes` and `exists`, also add something like `cache_exists`; negation (cf. https://gitlab.com/gitlab-org/gitlab/-/issues/34859) is even more important here!
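
The decision being requested for the "prepare" job can be sketched as a small predicate. This is only an illustration of the intended semantics, not how GitLab evaluates `rules`; `PipelineState`, `should_run_prepare` and the example cache key are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineState:
    """Hypothetical inputs a rule evaluator would need (illustrative only)."""
    changed_files: frozenset       # paths changed by the push
    existing_cache_keys: frozenset  # cache keys currently available


def should_run_prepare(state: PipelineState, lockfile: str, cache_key: str) -> bool:
    """Run the job if the lockfile changed OR no cache exists for the key."""
    # Corresponds to the existing `rules: changes` condition.
    lockfile_changed = lockfile in state.changed_files
    # Corresponds to the proposed, negated `cache_exists` condition.
    cache_missing = cache_key not in state.existing_cache_keys
    return lockfile_changed or cache_missing


if __name__ == "__main__":
    key = "my-project-main-abc123"  # made-up cache key
    warm = PipelineState(frozenset({"src/app.js"}), frozenset({key}))
    cold = PipelineState(frozenset({"src/app.js"}), frozenset())
    print(should_run_prepare(warm, "package-lock.json", key))
    print(should_run_prepare(cold, "package-lock.json", key))
```

The second case is the one that cannot be expressed today: the lockfile is unchanged, but the job still has to run because the cache is gone.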

Excerpt from the complete pipeline example in the problem statement:

```
cache:
  <<: *global_cache # inherit all global cache settings
rules:
  - changes:
      - package-lock.json
  # TODO: What I want is "OR if there is no cache (with the given key) available", not sure what the syntax should look like
  - !cache_exists
```

I am not sure whether the cache key should be (optionally) re-specified in case it differs from the one used by the job (although I am not sure there is any use case for that, or whether it makes sense at all).

Alternative syntax for negation:

```
rules:
  ...
  - cache_exists: false
```

... because that might be more consistent with the current syntax (although not fully, because `exists` actually takes a set of filenames), and it would also allow the following:

```
rules:
  ...
  - cache_exists
```

I think extending the existing `exists` to apply to the cache does not really work; how would `exists` for files AND caches be combined then?

```
rules:
  ...
  - exists: CACHE
```

### Links / references

* https://gitlab.com/gitlab-org/gitlab/-/issues/16905
* https://gitlab.com/gitlab-org/gitlab-foss/-/issues/19232
* https://gitlab.com/groups/gitlab-org/-/epics/2783
* https://docs.gitlab.com/ee/ci/caching/index.html#caching-nodejs-dependencies
* https://www.addthis.com/blog/2019/05/06/how-to-speed-up-your-gitlab-ci-pipelines-for-node-apps-by-40/#.XvWt0CgzaUk
issue