Skip to content

Log number of changed files scanned by SPP

Overview

In some projects, e.g. gitlab-foss, it was identified that some CI Jobs such as prepare-as-if-foss-branch run against the master branch instead of stable branches that the merge requests are targeting. As a result, we have seen this causing timeouts such as in this CI job as a result of trying to load git diffs for all changed files.

For example, in a job targeting the %18.4 stable branch, we see:

 5770 files changed, 71730 insertions(+), 210845 deletions(-)

And for a job targeting %18.3 stable branch, we see:

 10567 files changed, 112911 insertions(+), 338166 deletions(-)

But SPP was disabled for gitlab-foss, why a timeout still occurred?

Despite SPP being disabled for the gitlab-foss project, the feature flag for SPP's dark launch mode was enabled and disabling it fixed the problem. To clarify, the dark launch mode is an experimental mode where we send git diffs that would normally be scanned by SPP to the secret-detection -service to monitor performance and gauge how scalable is the feature before we enable the feature by default for all public projects on GitLab.com (around ~6 million projects) scheduled to be done in the next 1-2 milestones.

This comment from @craigmsmith explains this further.

How we plan to mitigate this?

We're working in #569064 to rescue and log any exceptions so that similar timeouts and errors do not affect the execution flow. In addition to that, the plan is to display an informational message when an exception is caught so that the users are aware when a push succeeds but the scan fails – so as not to give a false sense of security.

However, as an additional way to understand how many changes we're scanning, @amarpatel and I discussed the idea of logging the number of changed files that are being scanned by SPP, whether this is happening as part of the dark launch mode or the standard scanning mode. This issue aims to tackle and implement this logging behaviour.

Proposal

In order to add logging as described in the overview above, we should update SPP's PayloadProcessor to identify how many changed files there are, and log that information so we could use that to understand when failures or timeouts take place.

Edited by Ahmed Hemdan