Cache unstable between stages
We have a master and a dev branch to which we push regularly. Our CI is configured so that `node_modules` is cached between the stages of the same branch, as the docs recommend.
One of our jobs relies on modules installed by `npm ci` in a `setup` job. Subsequent jobs do not always find the `node_modules` directory, even though the logs always report the cache as successfully extracted.
See the following example:
- First commit fails on `master`. The `setup` job logs:
```
Running with gitlab-runner 11.8.0 (4745a6f3)
on b34672608ba1 8a6fcae6
Using Docker executor with image my.registry.com/my-npm-ci-docker-image ...
Pulling docker image my.registry.com/my-npm-ci-docker-image ...
Using docker image sha256:0a4c2bee112563b0e3e32dd04d627e8a012b913ef1c0baa9c3c9bab6f07812be for my.registry.com/my-npm-ci-docker-image ...
Running on runner-8a6fcae6-project-108-concurrent-1 via 3109e888ae36...
Cloning repository...
Cloning into '/builds/my-project'...
Checking out c6bc390a as master...
Skipping Git submodules setup
Checking cache for master...
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Successfully extracted cache
Git git version 2.11.0
Node v10.15.3
Npm 6.4.1
$ npm ci
> fsevents@1.2.7 install /builds/my-project/node_modules/fsevents
> node install
> husky@1.3.1 install /builds/my-project/node_modules/husky
> node husky install
husky > setting up git hooks
CI detected, skipping Git hooks installation.
added 1326 packages in 18.252s
Git git version 2.11.0
Node v10.15.3
Npm 6.4.1
Creating cache master...
node_modules/: found 19120 matching files
No URL provided, cache will be not uploaded to shared cache server. Cache will be stored only locally.
Created cache
Job succeeded
```
The `release` job logs:
```
Running with gitlab-runner 11.8.0 (4745a6f3)
on b34672608ba1 8a6fcae6
Using Docker executor with image my.registry.com/my-npm-ci-docker-image ...
Pulling docker image my.registry.com/my-npm-ci-docker-image ...
Using docker image sha256:7121107d53228c008d3f70a2f5facd2f1549b0c25a19321b181592c1c6271895 for my.registry.com/my-npm-ci-docker-image ...
Running on runner-8a6fcae6-project-108-concurrent-0 via 3109e888ae36...
Fetching changes...
Removing node_modules/
HEAD is now at 4c93117 chore: write more tests
Checking out c6bc390a as master...
Skipping Git submodules setup
Checking cache for master...
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Successfully extracted cache
Git git version 2.18.1
Node v10.15.2
Npm 6.4.1
Docker Docker version 18.06.1-ce, build d72f525745
$ ./node_modules/semantic-release/bin/semantic-release.js
/bin/sh: eval: line 71: ./node_modules/semantic-release/bin/semantic-release.js: not found
Git git version 2.18.1
Node v10.15.2
Npm 6.4.1
Docker Docker version 18.06.1-ce, build d72f525745
ERROR: Job failed: exit code 127
```
- Second commit (rebase of `dev` on `master`) succeeds on `dev`.
- Third commit succeeds on `master`. The only difference between the first and the third commit is that I added a blank line in the README to force GitLab to run a new pipeline, as I can't simply retry the failed job: it relies on a module installed into the cache by the `setup` job.
Here is our CI configuration (`.gitlab-ci.yml`):
```
cache:
  key: ${CI_COMMIT_REF_SLUG}
  paths:
    - node_modules/

stages:
  - setup
  - release

setup:
  stage: setup
  image: my-npm-ci-docker-image
  script:
    - npm ci

# build + test jobs

release:
  stage: release
  image: my-npm-ci-docker-image
  script:
    - ./node_modules/semantic-release/bin/semantic-release.js

# quality job
```
And the relevant versions:
- GitLab: 11.8.2 (self-hosted via Docker)
- GitLab Runner: 11.8.0 (self-hosted via Docker)
Note that:
- The other sources in the repository are not relevant.
- I know I could use `npx` as a workaround (a sketch is included at the end of this post).
- The runner uses a custom Docker image that embeds npm, Git, and Node. The log lines `Git git version 2.11.0`, `Node v10.15.3`, `Npm 6.4.1` are written by this custom image (a rough sketch of its entrypoint follows this list). I am not sure why they are printed twice, since they are only emitted in the container entrypoint; GitLab should log them once. It's as if the entrypoint output were stored in a buffer and re-printed at the end, or as if the container were run twice, with an empty script the second time. Anyway, this is probably unrelated, as the job works most of the time.
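For context, a minimal sketch of what that entrypoint looks like (the image is internal, so treat the exact script as an assumption):
```
#!/bin/sh
# Hypothetical entrypoint of my-npm-ci-docker-image: print the embedded
# tool versions, then hand control to whatever command the runner passes.
echo "Git $(git --version)"
echo "Node $(node --version)"
echo "Npm $(npm --version)"
exec "$@"
```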

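For reference, the `npx` workaround mentioned above would look roughly like this (a sketch, not what we currently run):
```
release:
  stage: release
  image: my-npm-ci-docker-image
  script:
    # npx resolves semantic-release from node_modules/.bin when the cache
    # was restored, and downloads it on the fly otherwise, so the job no
    # longer hard-depends on the cached node_modules
    - npx semantic-release
```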