Secret Detection is inaccurate for large Merge Requests
Summary
Secret Detection in Merge Requests with more than 50 commits will not scan the correct commits, which can lead to both false negatives and false positives.
Steps to reproduce
- Create a branch with 70+ commits.
- Add secret detection to the branch.
- Open a Merge Request.
- Check the
secret_detection
job. It will not scan the correct commits, leading to inaccurate results.
Example Project
https://gitlab.com/morgan_delagrange/secrets-detection-debug
It has two MRs with virtually identical changes, but the one with more than 50 commits is incorrect.
The MR with 12 commits scans 11 commits, and its secret_detection
job returns a correct result.
The MR with 71 commits only scans 49 commits, and its secret_detection
job returns a false positive.
What is the current bug behavior?
The MR job does not scan the correct commits for large branches.
What is the expected correct behavior?
The MR job should scan the correct commits regardless of branch size.
Output of checks
This bug happens on GitLab.com and on self-hosted GitLab. See "Example Project" above for examples on GitLab.com.
Possible fixes
The secrets analyzer tries to deepen the clone, but in a normal clone with GIT_DEPTH of 50 it won't have enough history to generate the correct range. I think that logic should be removed. If your clone is too shallow, it doesn't work correctly. If your clone is already deep enough, then it's unnecessary.
You should be able to fix it by:
- Releasing a version of the secrets analyzer that doesn't try to count the commits in the branch: gitlab-org/security-products/analyzers/secrets!252 (merged)
- Updating
Jobs/Secret-Detection.latest.gitlab-ci.yml
to:- Use the new secrets analyzer version.
- Do a full blobless clone rather than a shallow clone: https://gitlab.com/morgan_delagrange/gitlab/-/compare/master...blobless-clone-for-secrets-analyzer?from_project_id=278964&straight=false That should have good performance without interfering with the git history.
Here is an example of a MR that uses a build of the remove-git-deepen-for-mrs
branch above and blobless cloning: morgan_delagrange/secrets-detection-debug!3 (diffs) Its secret_detection
scans the correct number of commits (72), and it does not report a false positive: https://gitlab.com/morgan_delagrange/secrets-detection-debug/-/jobs/5253725959