
perf(Caching): Use GitLab caching to cache our install pipeline stage

Closes #3477 (closed)

What is changing in this MR?

Change the caching configuration so that the NPM modules are properly cached within GitLab, following the GitLab documentation and some blog posts with implementation examples (Blog 1, Blog 2).

  • The node-push-cache job was removed because it was not doing anything.
  • A global cache with a 'pull' policy has been created so that it is inherited by all jobs that need the cached node_modules. The cache key is computed from the package.json file: each time this file changes, the key is regenerated and a new cache is created in the pipeline.
## Global Cache
cache:
  - key:
      files:
        - package.json
    paths:
      - node_modules/
    policy: pull
  • Sometimes the cache will not be present due to external changes. A fallback_cache anchor was created to install the dependencies in jobs that do not find a cache, so that these jobs do not fail. Unfortunately, there is no way in GitLab CI to switch the cache policy to pull-push dynamically in order to update a missing cache. This means that when no cache is present and the install job has not run, the pipeline will take longer. As long as a cache is present in the main branch, this should not happen frequently.
## Anchors
.fallback_cache: &fallback_cache |
  if [[ -d node_modules ]]; then
    echo "Cached modules found. Using cache...";
  else
    echo "Cached modules not found. Installing modules...";
    yarn install --cache-folder .yarn-cache --frozen-lockfile --prefer-offline
  fi
  • The install job is the one that creates the cache for the subsequent pipeline jobs, and it is the only job with a pull-push policy. This job has two cache entries. The first uses the same cache key as the global cache, with a pull-push policy to update the node_modules cache. The second stores the .yarn-cache folder per job to speed up package installation when the main cache is not found and fallback_cache is triggered. The 'only' rule specifies that this job is executed only when the package.json file has changed, and only for Merge Requests and the main branch.
install:
  stage: .pre
  cache:
    - key:
        files:
          - package.json
      paths:
        - node_modules/
      when: on_success
      policy: pull-push # Update the cache
    - key: $CI_JOB_NAME
      paths:
        - .yarn-cache/
      when: on_success
      policy: pull-push
  script:
    - echo 'yarn-offline-mirror ".yarn-cache/"' >> .yarnrc
    - echo 'yarn-offline-mirror-pruning true' >> .yarnrc
    - yarn install --cache-folder .yarn-cache --frozen-lockfile --prefer-offline
  only:
    changes:
      - package.json
    refs:
      - merge_requests
      - main
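
As a side note on the 'only' block above, GitLab also offers the 'rules' keyword. The fragment below is a rough sketch, not part of this MR, of how the same trigger conditions might be expressed with 'rules'. It assumes the predefined variables CI_PIPELINE_SOURCE and CI_COMMIT_BRANCH and has not been run against this pipeline, so the behavior should be verified before adopting it.

```yaml
# Sketch only: a possible 'rules' equivalent of the 'only' block in the install job.
install:
  stage: .pre
  # cache: and script: are assumed to stay exactly as in the install job above
  rules:
    # Merge request pipelines in which package.json changed
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        - package.json
    # Branch pipelines on main in which package.json changed
    - if: $CI_COMMIT_BRANCH == "main"
      changes:
        - package.json
```

If neither rule matches, the job is not added to the pipeline, which mirrors the intent described above of running install only when package.json changes.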

Note: A minor change has been made to the package.json file to cache the modules in the main branch when this MR gets merged.
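
To illustrate how the pieces described above fit together, here is a minimal sketch of two downstream jobs. The job names (unit_test, notify) and their scripts are hypothetical and do not come from this MR. The fragment is meant to live in the same .gitlab-ci.yml as the global cache and the fallback_cache anchor, since the alias only resolves there.

```yaml
# Hypothetical job that needs node_modules.
# It defines no cache of its own, so it inherits the global 'pull' cache.
unit_test:
  stage: test
  before_script:
    # Reuses the fallback_cache anchor: installs dependencies only if
    # node_modules was not restored from the cache.
    - *fallback_cache
  script:
    - yarn test   # hypothetical script, assumed to exist in package.json

# Hypothetical job that does not need node_modules.
# An empty cache list disables the inherited global cache for this job.
notify:
  stage: test
  cache: []
  script:
    - echo "No dependencies needed here"
```

Opting out with an empty cache list avoids downloading node_modules in jobs that never use it.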

Metrics

All the metrics were taken in the context of this MR; new metrics should be taken when this MR hits production.

  1. Pipeline without the install job but with cached dependencies: ~9 min.

image

  2. Pipeline with the install job that updates the cache (triggered by a package.json change): ~10 min.

image

  3. Pipeline that relies on fallback_cache (the slowest one); this one will rarely, if ever, run in the main branch: ~11 min.

image

Edited by John Arias Castillo
