Backend: rules:changes impacted by diffs limits
Summary
rules:changes seems to be impacted by the built-in limits for merge requests and diffs.
For example, consider a CI/CD config with two jobs, each defining rules:changes with a wildcard path for a different directory.
When the number of file changes in a merge request exceeds the defined diff limit, changes for the second directory are not visible in the Changes tab.
The job whose rules:changes targets the missing directory does not run in the merge request pipeline.
This is unexpected and is not documented.
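The effect described above can be illustrated with a small simulation. This is a sketch only and does not run any GitLab code: the limit of 3000 files and the "keep the first entries in order" truncation are assumptions chosen to match the numbers in the reproduction steps, and `changed_paths` and `visible_in_dir` are hypothetical helper names.

```shell
#!/bin/bash
# Hypothetical simulation; no GitLab code is involved.
# LIMIT is an assumed cap on the number of files kept in a merge request diff.
LIMIT=3000

# List the changed paths from the reproduction: 3001 files in each of 1/ and 2/.
changed_paths() {
  for n in 1 2; do
    for i in $(seq 1 3001); do
      echo "$n/$i.txt"
    done
  done
}

# Count the paths under directory $1 that survive truncation to LIMIT entries.
visible_in_dir() {
  changed_paths | awk -v limit="$LIMIT" -v dir="$1/" \
    'NR <= limit && index($0, dir) == 1 { c++ } END { print c + 0 }'
}

echo "visible in 1/: $(visible_in_dir 1)"   # prints "visible in 1/: 3000"
echo "visible in 2/: $(visible_in_dir 2)"   # prints "visible in 2/: 0"
```

Under these assumptions, a pattern such as '2/**/*' has nothing left to match once the list is truncated, which is consistent with the second job not running.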
Steps to reproduce
- Create a project on GitLab.com
- Clone the project
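- Run the script below to create approximately 3000 files in each of the folders 1 and 2
#!/bin/bash
# Create folders 1 and 2, each holding 3001 small text files.
for n in {1..2}; do
  mkdir -p "$n"   # -p so that re-running the script does not fail
  for i in {1..3001}; do
    echo 'test' >> "$n/$i.txt"
  done
done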
- Add, commit and push the files to GitLab.com
- Add the CI/CD config below to the project
changes_in_1:
  extends:
    - .changes_in_1:build_rules
  script:
    - echo "changes_in_1"

changes_in_2:
  extends:
    - .changes_in_2:build_rules
  script:
    - echo "changes_in_2"

.changes_in_1:build_rules:
  rules:
    - if:
      changes:
        - 'config/**/*'
        - '1/**/*'

.changes_in_2:build_rules:
  rules:
    - if:
      changes:
        - 'config/**/*'
        - '2/**/*'
- Create and switch to a new feature branch
- Run the bash script again to modify all files in folders 1 and 2
- Add, commit and push the branch to GitLab.com
- Create the merge request on GitLab.com
- Inspect the created merge request pipeline and note that only one job runs
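To confirm locally that both folders really do contain changes on the feature branch, the changed paths can be grouped by top-level directory. This is a minimal sketch: `count_by_top_dir` is a hypothetical helper, and the branch names in the usage comment are placeholders for whatever the project actually uses.

```shell
#!/bin/bash
# Hypothetical helper: read changed paths on stdin and print the number of
# changed files per top-level directory, sorted by directory name.
count_by_top_dir() {
  awk -F/ '{ count[$1]++ } END { for (d in count) print d ": " count[d] }' | sort
}

# Usage inside the clone (branch names are placeholders):
#   git diff --name-only main...feature | count_by_top_dir
```

If both 1 and 2 show roughly 3000 changed files here while only one job runs in the merge request pipeline, the mismatch points at the limit rather than at the commit contents.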
Proposal
The FindChangedPaths Gitaly RPC returns tree-level changes between a set of revisions in a repository and is not affected by diff limits. From what we can tell, this fits the use case of detecting changes and shouldn't add much performance overhead. It sounds like the current approach uses the CommitDiff RPC, so using FindChangedPaths instead might even improve performance.
As there remains some uncertainty about the performance impact of this change, we should put it behind a feature flag to allow for a gradual rollout.
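The idea of listing changed paths without building full diffs can be tried locally with plain git. The sketch below is an approximation for illustration, not the Gitaly RPC or its implementation: it builds a throwaway repository that mirrors the reproduction on a smaller scale (five files per folder), and `changed_paths_between`, `commit_all` and `touch_files` are hypothetical helper names.

```shell
#!/bin/bash
# Sketch only: plain git on a throwaway repository, not the Gitaly RPC itself.

# Print every path that differs between two revisions, one per line.
# --name-only lists paths without generating patch text.
changed_paths_between() {
  git -C "$1" diff-tree -r --name-only --no-renames "$2" "$3"
}

# Hypothetical helper: commit everything in the repository with a fixed identity.
commit_all() {
  git -C "$1" add -A
  git -C "$1" -c user.name=demo -c user.email=demo@example.com \
    -c commit.gpgsign=false commit -q -m "$2"
}

# Write (or append to) five files in each of the folders 1 and 2.
touch_files() {
  for n in 1 2; do
    mkdir -p "$1/$n"
    for i in 1 2 3 4 5; do
      echo 'test' >> "$1/$n/$i.txt"
    done
  done
}

repo=$(mktemp -d)
git -C "$repo" init -q
touch_files "$repo"; commit_all "$repo" "base"
touch_files "$repo"; commit_all "$repo" "modify all files"

# Collect the changed paths, then summarise them per top-level folder.
paths=$(changed_paths_between "$repo" HEAD~1 HEAD)
echo "$paths" | awk -F/ '{ c[$1]++ } END { for (d in c) print d "/: " c[d] " changed" }' | sort
rm -rf "$repo"
```

Because only path names are produced, the size of the listing grows with the number of changed files rather than with the size of their contents, which is the property the proposal relies on.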
Example Project
What is the current bug behavior?
A job where the rules:changes keyword targets a directory that contains changed files is not running for the merge request pipeline.
What is the expected correct behavior?
A job where the rules:changes keyword targets a directory that contains changed files should run for the merge request pipeline.
Output of checks
This bug happens on GitLab.com
Possible fixes and implementation details.
Use FindChangedPaths for rules:changes; more information in this thread