
Backend: rules:changes impacted by diffs limits

Summary

rules:changes seems to be impacted by the built-in limits for merge requests and diffs.

For example, consider a CI/CD config with two jobs, each defining rules:changes with a wildcard path for a different directory.

When the number of file changes in a merge request exceeds the defined diffs limit, changes for the second directory are not visible in the changes tab.

The job whose rules:changes targets the directory missing from the merge request changes does not run in the merge request pipeline.

This is unexpected and is not documented.

Steps to reproduce

  1. Create a project on GitLab.com
  2. Clone the project
  3. Run the script below to create 3001 files in each of the folders 1 and 2
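#!/bin/bash
# Create (or, on a re-run, modify) 3001 files in each of the folders 1 and 2.
for n in {1..2}; do
  mkdir -p "$n"   # -p: do not fail when the folder already exists
  for i in {1..3001}; do
    # Appending changes the file content on every run
    echo 'test' >> "$n/$i.txt"
  done
done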
  1. Add, commit and push the files to GitLab.com
  2. Add the CI/CD config below to the project

changes_in_1:
  extends:
    - .changes_in_1:build_rules
  script: 
    - echo "changes_in_1"
    
changes_in_2:
  extends:
    - .changes_in_2:build_rules
  script: 
    - echo "changes_in_2"
    
.changes_in_1:build_rules:
  rules:  
    - if: 
      changes:
        - 'config/**/*'
        - '1/**/*'
  
.changes_in_2:build_rules:
  rules:
    - if: 
      changes:
        - 'config/**/*'
        - '2/**/*'
  1. Create and switch to a new feature branch
  2. Run the bash script again to modify all files in folders 1 and 2
  3. Add, commit and push the branch to GitLab.com
  4. Create the merge request on GitLab.com
  5. Inspect the created merge request pipeline and note that only one job runs
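
To confirm locally that the feature branch really touches both folders, the changed paths can be grouped by top-level directory. This is only a sketch: the helper name count_by_top_dir and the target branch name main are assumptions and are not part of the report.

```shell
#!/bin/bash
# Hypothetical helper: count changed paths per top-level directory.
# Reads one path per line on stdin and prints "<dir> <count>" lines.
count_by_top_dir() {
  awk -F/ '{ count[$1]++ } END { for (d in count) print d, count[d] }' | sort
}

# Example usage inside the clone, assuming the target branch is named "main":
#   git diff --name-only main...HEAD | count_by_top_dir
```

For the reproduction above, both folders 1 and 2 should appear in the output, even though only one of the two jobs runs.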

Proposal

The FindChangedPaths Gitaly RPC can be used to get tree-level changes between a set of revisions in a repository and is not affected by diff limits. From what we can tell, this should fit the needed use-case for detecting changes and shouldn't require too much performance overhead. It sounds like the current approach uses the CommitDiff RPC, so using FindChangedPaths instead might improve performance.
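
One way to picture the suspected failure mode: if the list of changed paths is cut off at a limit before the rules:changes patterns are evaluated, a pattern whose files all fall past the cutoff never matches. The sketch below emulates this in plain bash and is not GitLab's implementation. The limit of 3000, the function names, and the simplified shell pattern 2/* (standing in for '2/**/*') are assumptions chosen for illustration.

```shell
#!/bin/bash
# Illustration only: emulate evaluating a changes pattern against a complete
# and a truncated list of changed paths. Not GitLab's actual code.

limit=3000  # hypothetical cap on the number of paths that are considered

# Paths touched by the reproduction script: 1/1.txt ... 2/3001.txt
changed_paths() {
  local n i
  for n in 1 2; do
    for ((i = 1; i <= 3001; i++)); do
      echo "$n/$i.txt"
    done
  done
}

# Keep only the first $limit lines of stdin (still reads all input).
truncate_paths() {
  awk -v n="$limit" 'NR <= n'
}

# Succeed if any path on stdin matches the shell pattern given as $1.
any_match() {
  local pattern=$1 path found=1
  while IFS= read -r path; do
    case $path in
      $pattern) found=0 ;;
    esac
  done
  return "$found"
}

full_1=no; full_2=no; cut_1=no; cut_2=no
changed_paths | any_match '1/*' && full_1=yes
changed_paths | any_match '2/*' && full_2=yes
changed_paths | truncate_paths | any_match '1/*' && cut_1=yes
changed_paths | truncate_paths | any_match '2/*' && cut_2=yes

echo "full list:      changes_in_1=$full_1 changes_in_2=$full_2"
echo "truncated list: changes_in_1=$cut_1 changes_in_2=$cut_2"
```

Under these assumptions the full list matches both patterns, while the truncated list matches only the first, mirroring the single job seen in the merge request pipeline. A source of changed paths that is not subject to the cap, which is what the proposal expects from FindChangedPaths, would avoid this gap.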

As there remains some uncertainty about the performance impact of this change, we should put it behind a feature flag to allow for a gradual rollout.

Example Project

What is the current bug behavior?

A job where the rules:changes keyword targets a directory that contains changed files is not running for the merge request pipeline.

What is the expected correct behavior?

A job where the rules:changes keyword targets a directory that contains changed files should run for the merge request pipeline.

Output of checks

This bug happens on GitLab.com

Possible fixes and implementation details

Use FindChangedPaths for rules:changes; more information in this thread
