Enable Secret Detection MR pipelines

Problem: Confusion and gaps in what gets scanned when

There have been a number of cases where customers report secrets not being caught by Secret Detection. These reports usually come in when customers add secrets, don't check the push pipeline, push additional commits, and then open an MR. Secret Detection currently does not guarantee full coverage for MRs, i.e. secrets can be missed in MRs.

To understand this problem we can use this comment for context:

(The section below is copied from the comment and describes what gets scanned when)

1. `SECRET_DETECTION_HISTORIC_SCAN` is set

If `SECRET_DETECTION_HISTORIC_SCAN` is set, the secrets analyzer will prep the git environment by running `git fetch --all`, which will pull in the entire history of the git repo. The entire history of the repo will then be scanned.

2. `SECRET_DETECTION_LOG_OPTS` is set

If `SECRET_DETECTION_LOG_OPTS` is set, the secrets analyzer will first fetch the entire history of the branch/ref the pipeline is currently being run for. After that, a gitleaks scan will be performed with the git log options passed to gitleaks using `SECRET_DETECTION_LOG_OPTS`. For example, if a user wanted to scan commits between commitA and commitB, they could use `SECRET_DETECTION_LOG_OPTS: commitA^..commitB`
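To make the `commitA^..commitB` example concrete, here is a minimal Python sketch of which commits that range selects. It models the branch as a simple linear list of commit names, which is an assumption for illustration only; the `history` list and the `log_range` helper are hypothetical and are not part of the analyzer or gitleaks.

```python
from typing import List


def log_range(history: List[str], start: str, end: str,
              include_start: bool = True) -> List[str]:
    """Model the commits selected by a git revision range on a linear history.

    include_start=True  models `start^..end` (start itself is included).
    include_start=False models `start..end`  (start itself is excluded).
    `history` is ordered oldest first. This is an illustrative model only.
    """
    lo = history.index(start)
    hi = history.index(end)
    if not include_start:
        lo += 1
    return history[lo:hi + 1]


# Hypothetical linear branch history, oldest commit first.
history = ["commit0", "commitA", "commitX", "commitB", "commitZ"]

with_caret = log_range(history, "commitA", "commitB")
without_caret = log_range(history, "commitA", "commitB", include_start=False)

print(with_caret)
print(without_caret)
```

The caret matters here: without it, commitA itself would be left out of the scan. Real histories with merges are not linear, so this only approximates what `git log` selects.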

3. Default branch

When secret detection is run on the default branch, the analyzer will run a `--no-git` gitleaks scan, i.e. treating the repo as a plain directory. No commit history is scanned, only the contents of the repo at the current HEAD.

4. Push Event

The secret analyzer will determine what commit range to scan on push events given the information available in the runner. Two pieces of information are crucial to this: `CI_COMMIT_SHA` and `CI_COMMIT_BEFORE_SHA`. `CI_COMMIT_SHA` is the commit at HEAD for a given branch; this value is always set for push events. `CI_COMMIT_BEFORE_SHA` is set in most cases. However, `CI_COMMIT_BEFORE_SHA` is not set for the first push event on a new branch or for merge pipelines. Because of this, we cannot guarantee full secret detection coverage if a user pushes multiple commits to a new branch.

Some steps that might help visualize this:

action 1: create new branch
action 2: commit commit A
action 3: commit commit B
action 4: commit commit C
action 5: push new branch to origin 
action 6: GitLab Secret Detection will scan commit C only since `CI_COMMIT_BEFORE_SHA` is not set.
action 7: commit commit D
action 8: commit commit E
action 9: commit commit F
action 10: push branch to origin 
action 11: GitLab Secret Detection will scan D through F since `CI_COMMIT_BEFORE_SHA` is now set to `commit C`
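The steps above can be sketched in Python to show where the gap comes from. This is a simplified model under stated assumptions, not the analyzer's actual implementation: the `commits_scanned` helper and the linear `history` list are hypothetical, and treating an empty or all-zero `CI_COMMIT_BEFORE_SHA` as "not set" is an assumption about how the missing value shows up in the runner.

```python
from typing import List, Optional

# Assumption: an all-zero SHA is treated the same as an unset value.
ZERO_SHA = "0" * 40


def commits_scanned(history: List[str], commit_sha: str,
                    before_sha: Optional[str]) -> List[str]:
    """Model which commits a push-event scan covers on a linear history.

    `history` is ordered oldest first. `commit_sha` stands in for
    CI_COMMIT_SHA and `before_sha` for CI_COMMIT_BEFORE_SHA.
    """
    head = history.index(commit_sha)
    if not before_sha or before_sha == ZERO_SHA:
        # No previous HEAD is known, so only the HEAD commit is scanned.
        return [history[head]]
    # Scan everything after the previous HEAD, up to and including HEAD.
    start = history.index(before_sha) + 1
    return history[start:head + 1]


# Commits A..F on a new branch, pushed in two batches as in the steps above.
history = ["A", "B", "C", "D", "E", "F"]

first_push = commits_scanned(history, "C", None)   # action 6
second_push = commits_scanned(history, "F", "C")   # action 11

covered = set(first_push) | set(second_push)
missed = [c for c in history if c not in covered]

print(first_push, second_push, missed)
```

In this model, commits A and B are never scanned by any push pipeline, which is the coverage gap described above.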

5. Merge Request

Not officially supported at this time. That said, secret detection scans will still be triggered by push events and will still show up in the ~~vulnerability dashboard for the merge request~~ Pipeline Security Tab.

I believe Secret Detection scans work as intended (according to the section above) for everything except MRs. This is a problem since most customers will ignore intermediate findings discovered during push events until they open an MR. This leads to an awkward experience and gives users a false sense of security.

Solutions

From the discussions in this thread it looks like supporting merge request pipelines is a worthwhile approach.

Tasks

Update SD vendored template.
Add a downstream QA test. I'm not sure exactly how this will work since we will want to kick off an MR pipeline triggered from another project. I would assume the majority of the time spent on this issue will go into creating a reproducible test.
Update documentation. There is a statement in !108656 (merged) saying MR pipelines are unsupported; we should update it to say when the template was updated to provide support for MR pipelines.

Edited Jan 13, 2023 by Connor Gilbert