Many false positives with new secrets analyzer due to new rules
Summary
The new Hashicorp Vault rules (https://gitlab.com/gitlab-org/security-products/analyzers/secrets/-/blob/master/gitleaks.toml) still produce many false positives in some of our projects. Identifiers such as m_timeouts.ReadTotalTimeoutConstant match the service token regex because they contain "s." followed by exactly 24 alphanumeric characters, even though they are clearly variable names and not tokens.
It might be better to change the regex to: ('|"|\s)s\.[0-9a-zA-Z]{24}('|"|\n|\r|\s), so that a match requires a quote or whitespace character on both sides of the token.
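The effect of the proposed change can be checked with a small script. This is only a sketch: it uses Python's re module rather than the Go regexp engine that gitleaks uses, the UNANCHORED pattern is an assumed simplification of the rule that currently fires (the exact pattern in gitleaks.toml may differ), and the token-shaped values are made up for illustration.

```python
import re

# Assumption: a simplified stand-in for the rule that currently fires.
# The exact pattern shipped in gitleaks.toml may differ.
UNANCHORED = re.compile(r"s\.[0-9a-zA-Z]{24}")

# The pattern proposed in this issue, with the dot escaped: a quote or
# whitespace before the "s", and a quote or whitespace after the body.
PROPOSED = re.compile(r"""('|"|\s)s\.[0-9a-zA-Z]{24}('|"|\n|\r|\s)""")

# The line from main.cs that triggers the false positive.
false_positive = "m_timeouts.ReadTotalTimeoutConstant = 1;"

# Made-up, token-shaped values for illustration only (not real secrets).
fake_body = "aB3" * 8  # 24 alphanumeric characters
quoted_token = 'token = "s.' + fake_body + '"'
unquoted_assignment = "VAULT_TOKEN=s." + fake_body + "\n"


def matches(pattern, line):
    """Return True if the pattern matches anywhere in the line."""
    return pattern.search(line) is not None


for label, line in [
    ("variable name", false_positive),
    ("quoted token", quoted_token),
    ("unquoted assignment", unquoted_assignment),
]:
    print(label, matches(UNANCHORED, line), matches(PROPOSED, line))
```

Under these assumptions, the variable name matches only the unanchored pattern, and the quoted token matches both. The unquoted assignment shows a possible trade-off: because "=" is not in the leading group, a value written as VAULT_TOKEN=s.… would no longer be reported by the proposed pattern.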
Context
- This rule was added in gitlab-org/security-products/analyzers/secrets!133 (comment 824852590) and noted to be overbroad in many cases.
- It was adjusted in gitlab-org/security-products/analyzers/secrets!136 (merged).
Steps to reproduce
Set up a new Git repository containing a file named "main.cs":
using System;
namespace Timeout
{
    class Program
    {
        static void Main(string[] args)
        {
            m_timeouts.ReadTotalTimeoutConstant = 1;
            Console.WriteLine("Timeout {}",m_timeouts.ReadTotalTimeoutConstant);
        }
    }
}
Run:
$ docker run -ti --rm -v $PWD:/build -w /build registry.gitlab.com/gitlab-org/security-products/analyzers/secrets:3 /bin/sh
build # /analyzer run
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ GitLab secrets analyzer v3.24.7
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ Detecting project
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ Found project in /build
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ Running analyzer
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ ○
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ │╲
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ │ ○
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ ○ ░
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ ░ gitleaks
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ 1:30PM WRN leaks found: 1
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ 1:30PM INF scan completed in 1.877823ms
[INFO] [secrets] [2022-02-01T13:30:30Z] ▶ Creating report
What is the current bug behavior?
The analyzer reports a Hashicorp Vault service token, but the matched text is part of an ordinary variable name (m_timeouts.ReadTotalTimeoutConstant).
What is the expected correct behavior?
No secrets are found.
Relevant logs and/or screenshots
{
  "version": "14.0.0",
  "vulnerabilities": [
    {
      "id": "a41ac825c3e84aacc59eb3fb63523b3fb1e7c4aa7e773b91940156db7e367fca",
      "category": "secret_detection",
      "name": "Hashicorp Vault service token",
      "message": "Hashicorp Vault service token detected; please remove and revoke it if this is a leak.",
      "description": "Hashicorp Vault service token",
      "cve": "main.cs:e4eca0eef59d1d7eb4ed48a7e6391a21f1513391d68858bda9871c9c5ab985a2:Hashicorp Vault service token",
      "severity": "Critical",
      "confidence": "Unknown",
      "raw_source_code_extract": "s.ReadTotalTimeoutConstant ",
      "scanner": {
        "id": "gitleaks",
        "name": "Gitleaks"
      },
      "location": {
        "file": "main.cs",
        "commit": {
          "sha": "0000000"
        },
        "start_line": 9,
        "end_line": 9
      },
      "identifiers": [
        {
          "type": "gitleaks_rule_id",
          "name": "Gitleaks rule ID Hashicorp Vault service token",
          "value": "Hashicorp Vault service token"
        }
      ]
    }
  ],
  "remediations": [],
  "scan": {
    "scanner": {
      "id": "gitleaks",
      "name": "Gitleaks",
      "url": "https://github.com/zricethezav/gitleaks",
      "vendor": {
        "name": "GitLab"
      },
      "version": "8.2.7"
    },
    "type": "secret_detection",
    "start_time": "2022-02-01T13:30:30",
    "end_time": "2022-02-01T13:30:30",
    "status": "success"
  }
}
Output of checks
Results of GitLab environment info
The issue appears in the CI/CD compliance framework scanners, when using the secrets image with the latest gitleaks.toml: https://gitlab.com/gitlab-org/security-products/analyzers/secrets/-/blob/master/gitleaks.toml
Image: registry.gitlab.com/gitlab-org/security-products/analyzers/secrets:3
Results of GitLab application Check
Possible fixes
Change the regex to require a single quote, double quote, or whitespace character before the "s":
[[rules]]
id = "Hashicorp Vault service token"
description = "Hashicorp Vault service token"
regex = '''('|"|\s)s\.[0-9a-zA-Z]{24}('|"|\n|\r|\s)'''
[[rules]]
id = "Hashicorp Vault batch token"
description = "Hashicorp Vault batch token"
regex = '''('|"|\s)b\.AAAAAQ[0-9a-zA-Z_-]{156}('|"|\n|\r|\s)'''
It would be good to do the same for other regex values where possible.
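One way to apply the same delimiters to several rules is to keep each rule's core pattern separate and wrap it mechanically. The following is a rough sketch in Python, not how the analyzer is implemented: delimit, CORES, RULES, and scan are hypothetical names, the two core patterns are taken from the proposal above, and gitleaks itself evaluates rules with Go's regexp engine.

```python
import re

# Delimiter groups taken from the proposal in this issue.
LEADING = r"""('|"|\s)"""
TRAILING = r"""('|"|\n|\r|\s)"""


def delimit(core):
    """Wrap a token pattern so it only matches between quotes or whitespace."""
    return LEADING + core + TRAILING


# Core patterns from the two rules proposed above; further rules could be
# added here where delimiters make sense for the token format.
CORES = {
    "Hashicorp Vault service token": r"s\.[0-9a-zA-Z]{24}",
    "Hashicorp Vault batch token": r"b\.AAAAAQ[0-9a-zA-Z_-]{156}",
}

RULES = {name: re.compile(delimit(core)) for name, core in CORES.items()}


def scan(line):
    """Return the names of all rules that match somewhere in the line."""
    return [name for name, rx in RULES.items() if rx.search(line)]


# The variable from the reproduction case is no longer reported.
print(scan("m_timeouts.ReadTotalTimeoutConstant = 1;"))

# A made-up, batch-token-shaped value in quotes is still reported.
print(scan('x = "b.AAAAAQ' + "Z" * 156 + '"'))
```

Keeping the delimiters in one place makes it easier to adjust them later, for example to add "=" or ":" to the leading group if unquoted assignments should also be detected.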