Backend: CI: File-based cache key differs though file contents are the same
Two CI jobs that use a file-based cache key (`cache:key:files`) compute different cache keys even though the contents of the key file did not change.
### GitLab Version
- GitLab CE, self-managed, `v16.11.1`
- GitLab Shell `14.35.0`
- Pipelines running on GitLab Runners (`16.11.0`) with Docker executors (using mounted `docker.sock`)
### Problem Outline
A job in an MR pipeline looks up the cache under a different key than the one an earlier pipeline used to push it, even though the file the key is derived from (`yarn.lock`) did not change.
* The caches between protected and non-protected branches are not separated (we don't see any `-protected` or `-non_protected` suffix on the cache key)
* The cache from a pipeline that ran on `main` is correctly built and pushed (stored on AWS S3):
```
Creating cache core-84bdac10a1ceb7fc92081d17feada56abc580054...
node_modules/: found <X> matching artifact files and directories
Uploading cache.zip to <REDACTED>
```
* The pipeline on a feature branch looks for the cache but fails, because no cache exists under the key it computed:
```
Checking cache for core-0ced49fb798cbcaa7ead3ddbeb691ecfb2e98021...
WARNING: file does not exist
Failed to extract cache
```
* A `git diff -- yarn.lock` between the two commits in question yields an empty result
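
The `git diff` check above only compares file contents. As far as I understand GitLab's documentation, the `cache:key:files` key is derived from the most recent commits that changed each listed file, so it may also be worth comparing those commit IDs. The following is a minimal sketch for doing both locally; the function names `blob_id`, `last_change`, and `compare_key_inputs` are hypothetical helpers, not part of git or GitLab, and the sketch does not reproduce GitLab's actual key computation.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing what could feed into a file-based cache key.
# They only wrap plain git commands; they do not replicate GitLab's key algorithm.

# Print the blob ID (content hash) of a file as stored at a given commit.
blob_id() {
  git rev-parse "$1:$2"
}

# Print the most recent commit, reachable from the given commit, that changed the file.
last_change() {
  git log -1 --format=%H "$1" -- "$2"
}

# Compare two commits with respect to one file and report both results.
compare_key_inputs() {
  local a="$1" b="$2" file="$3"
  if [ "$(blob_id "$a" "$file")" = "$(blob_id "$b" "$file")" ]; then
    echo "blob: same"
  else
    echo "blob: different"
  fi
  if [ "$(last_change "$a" "$file")" = "$(last_change "$b" "$file")" ]; then
    echo "last-change commit: same"
  else
    echo "last-change commit: different"
  fi
}
```

Called as `compare_key_inputs "$MAIN_SHA" "$MR_SHA" yarn.lock`, where `MAIN_SHA` and `MR_SHA` are placeholders for the commits the two pipelines ran on. If both lines report "same", the difference in keys would have to come from somewhere other than the file's history as seen from those two commits.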
### Reproduction
(Job names refer to the `.gitlab-ci.yml` below.)
1. No cache is available at all, and a pipeline runs on `main`;
2. Job `node-install-dependencies` runs, fails to find a cache, runs its script, then generates and pushes a cache correctly;
3. Job `lint-all` runs, finds the cache, pulls it, succeeds;
4. Branch off of `main`, put some dummy commit on top, push and create a new MR;
5. A pipeline runs on that MR;
6. Job `node-install-dependencies` is skipped because `rules: changes: yarn.lock` correctly identifies that the file did not change;
7. Job `lint-all` runs, fails to find the cache, and fails overall.
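
The branching step above can be sketched as a small helper. This is an assumption-laden sketch: `make_repro_branch` is a hypothetical name, and it uses an empty commit as the "dummy commit", which is one way to guarantee that `yarn.lock` stays untouched (the original reproduction may have used a different kind of commit).

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a branch from a base ref and add a dummy commit
# that changes no files at all (assumption: an empty commit is an acceptable
# stand-in for "some dummy commit").
make_repro_branch() {
  local base="$1" branch="$2"
  # Create and check out the new branch starting at the base ref.
  git checkout -q -b "$branch" "$base"
  # Add a commit with no file changes, so yarn.lock is guaranteed untouched.
  git commit -q --allow-empty -m "chore: dummy commit for cache key repro"
}
```

For example, `make_repro_branch main cache-key-repro` followed by `git push -u origin cache-key-repro` and opening an MR for that branch; the branch name and the remote name `origin` are placeholders.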
### Expectation
I'd expect the cache keys on both `lint-all` jobs to be the exact same, since (a) the `yarn.lock` contents did not change, and (b) the `node-install-dependencies` job was correctly skipped _because `yarn.lock` did not change_.
### Context: `.gitlab-ci.yml`
```yaml
variables:
  GIT_DEPTH: 0
  FF_USE_FASTZIP: 'true'
  ARTIFACT_COMPRESSION_LEVEL: 'fastest'
  CACHE_COMPRESSION_LEVEL: 'fastest'

workflow:
  rules:
    - if: $CI_MERGE_REQUEST_ID
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

default:
  interruptible: true

stages:
  - preflight
  - test
  - build
  - configure
  - deploy

.cache-ref:
  - &node_cache
    key:
      files:
        - yarn.lock
      prefix: $CI_PROJECT_NAME
    paths:
      - node_modules/
    policy: pull

node-install-dependencies:
  stage: preflight
  rules:
    - changes:
        - yarn.lock
  cache:
    - <<: *node_cache
      policy: pull-push
    - key: ${CI_PROJECT_NAME}-${CI_JOB_NAME}
      paths:
        - .yarn-cache/
      when: on_success
      policy: pull-push
  script:
    - yarn install --cache-folder .yarn-cache --frozen-lockfile --non-interactive

lint-all:
  stage: test
  only:
    - main
    - merge_requests
  cache:
    - <<: *node_cache
  script:
    - npx --yes nx run-many --target=lint --all
```