Backend: Multiple caches leading to a hash collision

Workaround

To avoid possible cache key collisions, set a distinct hardcoded key:prefix for each cache:

test-job:
  cache:
    - key:
        files:
          - Gemfile.lock
        prefix: '1'
      paths:
        - vendor/ruby
    - key:
        files:
          - yarn.lock
        prefix: '2'
      paths:
        - .yarn-cache/

Summary

Using multiple caches in a single project can produce colliding cache keys because of how the key is calculated.

Steps to reproduce

  • Create a .gitlab-ci.yml that defines multiple caches whose keys depend on lock files.
  • Run the pipeline.
  • See the cache key collision in the job log (the same key is used for both caches, even though the paths and file counts differ):
Creating cache 25b4b97e42e1c8264602627bd9591c7428c46135-1-non_protected...
node_modules/: found 844 matching files and directories 

Creating cache 25b4b97e42e1c8264602627bd9591c7428c46135-1-non_protected...
folder2/node_modules/: found 2946 matching files and directories 
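
A configuration along the following lines should be enough to hit the collision. This is only a sketch: the cache paths are taken from the log above, but the job name, the script line, and the choice of package.json as the key file are assumptions made for illustration, not the original project's configuration.

```yaml
# Hypothetical reproduction; the job name and script are placeholders.
test-job:
  script:
    - echo "install dependencies for both packages here"
  cache:
    # First cache: key derived from the top-level package.json (assumed key file).
    - key:
        files:
          - package.json
      paths:
        - node_modules/
    # Second cache: key derived from a different file (assumed key file),
    # yet the job log shows it being created under the same key as the first.
    - key:
        files:
          - folder2/package.json
      paths:
        - folder2/node_modules/
```

Neither cache sets key:prefix here, so nothing distinguishes the two keys if the computed value for both files turns out to be identical.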

What is the current bug behavior?

The same cache key is used for different sets of files.

Even though the two package.json files list different packages, both caches still end up with the same key.

What is the expected correct behavior?

Each cache gets its own unique key.

Output of checks

This bug happens on GitLab.com

Possible fixes

I believe the key is calculated on the following line: https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/pipeline/seed/build/cache.rb#L55

Edited by Laura Montemayor