[Spike] Exclusions for Pipeline SD
Overview
The Secret Detection Exclusions feature was introduced as part of the Secret Push Protection GA release. Exclusions are stored in the database and retrieved/applied during scanning for secrets. Those exclusions handle three types of secrets:
- Secrets matching a rule from the default ruleset, e.g. gitlab_pipeline_trigger_token.
- Secrets found in a file matching a path that is either specific (e.g. spec/app/project_spec.rb) or a simple glob (e.g. spec/**/*.rb).
- Secrets matching a specific raw value, e.g. dummyfaketoken-1234567890.
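To make the three exclusion types concrete, here is a minimal sketch of how a finding could be checked against them. The `Exclusion` and `Finding` classes, the type names (`rule`, `path`, `raw_value`), and the use of `fnmatch` for glob matching are illustrative assumptions, not the actual data model or matching semantics used by GitLab.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Exclusion:
    type: str   # hypothetical type names: "rule", "path", "raw_value"
    value: str


@dataclass(frozen=True)
class Finding:
    rule_id: str    # ID of the rule that detected the secret
    path: str       # file in which the secret was found
    raw_value: str  # the detected secret itself


def is_excluded(finding: Finding, exclusions: list[Exclusion]) -> bool:
    """Return True if any exclusion matches the finding."""
    for exclusion in exclusions:
        if exclusion.type == "rule" and finding.rule_id == exclusion.value:
            return True
        # fnmatchcase only approximates glob semantics: "*" also matches "/",
        # and "**/" does not match zero directories as it would in a shell.
        if exclusion.type == "path" and fnmatchcase(finding.path, exclusion.value):
            return True
        if exclusion.type == "raw_value" and finding.raw_value == exclusion.value:
            return True
    return False
```

A specific path works as a pattern with no wildcards, so both path forms go through the same branch.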
This spike explores the technical feasibility of, and the direction for, applying those exclusions in Pipeline SD, which runs isolated in a container as part of a CI job and has no access to the GitLab monolith's database.
Proposal
See also the discussion below for more information.
We came up with two approaches to each aspect of applying exclusions to Pipeline SD, and want to confirm the viability of at least one of them:
1️⃣ Exclusions Retrieval and Injection
Injecting exclusions via CI Job Artifacts
In this approach, before a secret_detection job runs, we load the project's SD exclusions from the database and write them to a job artifact that the secret_detection job can read. If the artifact is found, the secrets analyzer reads the file and applies the exclusions.
This is ruled out in favour of the two other approaches described below.
Injecting exclusions via a file stored in the working directory
In this approach, a before_script is added to the Secret Detection CI template and is used to:
- Call an API endpoint (e.g. GraphQL Project.securityExclusions query) to retrieve exclusions.
- Save the list of exclusions to a file stored in the working directory of the container running the secret_detection job.
Then the analyzer could read the exclusions from that file during scanning and apply those exclusions.
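The reading side of this approach could look like the sketch below. The file name `.secret-detection-exclusions.json` and the JSON layout (a list of objects with `type` and `value` keys) are assumptions made for illustration; the spike does not fix either. A missing file is treated as "no exclusions" so that projects without exclusions scan as before.

```python
import json
from pathlib import Path

# Hypothetical file name; the before_script would have to write to the same path.
DEFAULT_EXCLUSIONS_FILE = ".secret-detection-exclusions.json"


def load_exclusions_from_file(path=DEFAULT_EXCLUSIONS_FILE):
    """Read exclusions written by the before_script, or return [] if absent."""
    file = Path(path)
    if not file.is_file():
        return []
    entries = json.loads(file.read_text(encoding="utf-8"))
    # Keep only the two fields this sketch relies on.
    return [{"type": entry["type"], "value": entry["value"]} for entry in entries]
```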
Injecting exclusions via ENVs
Another idea is to inject the exclusions in a similar way (retrieved through an API endpoint as discussed above), but instead of saving them to a file, we pass the list of exclusions to the analyzer as an environment variable on initialization.
This isn't an ideal solution, because ENVs are too simplistic for our needs, but it is still worth exploring.
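As a rough sketch of the ENV variant, the list could be serialized as JSON into a single variable and decoded by the analyzer at start-up. The variable name `SECRET_DETECTION_EXCLUSIONS` and the JSON encoding are assumptions for illustration only. The need to pack structured data into one string is part of why this option feels limited.

```python
import json
import os

# Hypothetical variable name, not an existing CI/CD variable.
ENV_NAME = "SECRET_DETECTION_EXCLUSIONS"


def load_exclusions_from_env(environ=None, name=ENV_NAME):
    """Decode a JSON list of exclusions from an environment variable."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name, "").strip()
    if not raw:
        return []  # variable unset or empty: nothing to exclude
    try:
        entries = json.loads(raw)
    except json.JSONDecodeError as error:
        raise ValueError(f"{name} does not contain valid JSON") from error
    if not isinstance(entries, list):
        raise ValueError(f"{name} must contain a JSON list")
    return [{"type": entry["type"], "value": entry["value"]} for entry in entries]
```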
2️⃣ Exclusions Processing
Processing exclusions via Analyzer Engine (i.e. common module)
One way to process exclusions is to add the handling logic to the common module. In that scenario, we read the exclusions from either the file stored in the working directory or from environment variables, and apply them before the report is generated, similar to how the report module currently filters out vulnerabilities/secrets from disabled rules.
Processing exclusions via CI Components
Another idea is to introduce a new CI component that runs through the report generated by the secret-detection CI template or component and removes every finding that matches one of the exclusions defined for the project.
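The post-processing step such a component would perform can be sketched as a pure function over the parsed report. The report field names used here (`vulnerabilities`, `identifiers[].value`, `location.file`, `raw_source_code_extract`) and the exclusion shape (`type`/`value` dictionaries) are assumptions about the report layout, so treat this as an outline rather than a working component.

```python
from fnmatch import fnmatchcase


def finding_is_excluded(vulnerability, exclusions):
    """Check one report entry against rule, path, and raw-value exclusions."""
    rule_ids = {ident.get("value") for ident in vulnerability.get("identifiers", [])}
    path = vulnerability.get("location", {}).get("file", "")
    extract = vulnerability.get("raw_source_code_extract", "")
    for exclusion in exclusions:
        kind, value = exclusion["type"], exclusion["value"]
        if kind == "rule" and value in rule_ids:
            return True
        if kind == "path" and fnmatchcase(path, value):
            return True
        # Assumes the extract contains the secret; empty values never match.
        if kind == "raw_value" and value and value in extract:
            return True
    return False


def filter_report(report, exclusions):
    """Return a copy of the report without excluded findings."""
    kept = [
        vulnerability
        for vulnerability in report.get("vulnerabilities", [])
        if not finding_is_excluded(vulnerability, exclusions)
    ]
    return {**report, "vulnerabilities": kept}
```

Because the function returns a new dictionary instead of mutating its input, the component could keep the unfiltered report around for debugging.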
Progress
Below is a summary of the approaches discussed in the proposal section above:
| Exclusion Injection | Exclusion Processing | POC/Demo |
|---|---|---|
| Stored in File | Applied in secrets analyzer (before report is generated) | See #503184 (comment 2321427953). |
| Stored in File | Applied in a new CI Component (after report is generated) | See #503184 (comment 2332994757). |
| Passed in ENVs | Applied in secrets analyzer (before report is generated) | See #503184 (comment 2321427953). |
| Passed in ENVs | Applied in a new CI Component (after report is generated) | See #503184 (comment 2333109292). |
Note: For all approaches, exclusions are retrieved from the database via the GraphQL API.