semgrep-sast sarif translator does not account for vulnerabilities ignored and labelled as suppressed.
Summary
From semgrep version 0.61.0 onwards, when a semgrep-sast scan runs on files that contain inline comments to ignore findings, semgrep includes the ignored findings but labels them as suppressed in the corresponding semgrep.sarif output file.
However, the sarif translator that converts the .sarif file into a GitLab SAST report does not account for this label. It includes the ignored findings in the report as vulnerabilities, without labelling them as dismissed or suppressed.
This leads to a gl-sast-report.json file that contains inaccurate information about vulnerabilities that one would like to ignore or dismiss.
Steps to reproduce
- Create 2 test files in a folder in a VM or local pc: test-no-ignore.py and test-yes-ignore.py, with the following contents:
- test-no-ignore.py
import requests
url = "https://google.com"
requests.head(url, verify=False)
- test-yes-ignore.py
import requests
url = "https://google.com"
requests.head(url, verify=False) # nosemgrep
- While in the folder with the 2 files, pull the analyzer docker container and override the entry point to "/bin/sh"
$ docker run -v "${PWD}:/src" --entrypoint "/bin/sh" -it registry.gitlab.com/gitlab-org/security-products/analyzers/semgrep
- Confirm the semgrep version and run the analyzer manually.
# semgrep --version
0.69.1
# /analyzer run --target-dir /src
- A gl-sast-report.json file will be produced at / and a semgrep.sarif file at /src.
- (optional) Install jq for better output with semgrep.sarif (apk add jq).
- Check out the output of the 2 files.
# cat /src/semgrep.sarif | jq | less
...
...
"results": [
{
"locations": [
{
"physicalLocation": {
"artifactLocation": {
"uri": "/src/test-no-ignore.py",
"uriBaseId": "%SRCROOT%"
},
"region": {
"endColumn": 33,
"endLine": 5,
"startColumn": 1,
"startLine": 5
}
}
}
],
"message": {
"text": "Certificate verification has been explicitly disabled. This\npermits insecure connections to insecure servers. Re-enable\ncertification validation.\n"
},
"ruleId": "bandit.B501"
},
{
"locations": [
{
"physicalLocation": {
"artifactLocation": {
"uri": "/src/test-yes-ignore.py",
"uriBaseId": "%SRCROOT%"
},
"region": {
"endColumn": 33,
"endLine": 5,
"startColumn": 1,
"startLine": 5
}
}
}
],
"message": {
"text": "Certificate verification has been explicitly disabled. This\npermits insecure connections to insecure servers. Re-enable\ncertification validation.\n"
},
"ruleId": "bandit.B501",
"suppressions": [
{
"kind": "inSource"
}
]
}
],
...
...
# cat gl-sast-report.json
{
"version": "14.0.0",
"vulnerabilities": [
{
"id": "34b3baf0bee440208167fb2f58c47951b11fafe8dae1b7afaa552285f360c1cf",
"category": "sast",
"message": "Improper Certificate Validation",
"description": "Certificate verification has been explicitly disabled. This\npermits insecure connections to insecure servers. Re-enable\ncertification validation.\n",
"cve": "",
"severity": "Critical",
"scanner": {
"id": "semgrep",
"name": "Semgrep"
},
"location": {
"file": "src/test-no-ignore.py",
"start_line": 5,
"end_line": 5
},
"identifiers": [
{
"type": "semgrep_id",
"name": "bandit.B501",
"value": "bandit.B501",
"url": "https://semgrep.dev/r/gitlab.bandit.B501"
},
{
"type": "cwe",
"name": "CWE-295",
"value": "295",
"url": "https://cwe.mitre.org/data/definitions/295.html"
},
{
"type": "owasp",
"name": "Sensitive Data Exposure",
"value": "A3"
},
{
"type": "bandit_test_id",
"name": "Bandit Test ID B501",
"value": "B501"
}
]
},
{
"id": "a6db051149d01d41fbac78839ab688659d7a8110b6f9bf2b6c46e1a5cb3ace7c",
"category": "sast",
"message": "Improper Certificate Validation",
"description": "Certificate verification has been explicitly disabled. This\npermits insecure connections to insecure servers. Re-enable\ncertification validation.\n",
"cve": "",
"severity": "Critical",
"scanner": {
"id": "semgrep",
"name": "Semgrep"
},
"location": {
"file": "src/test-yes-ignore.py",
"start_line": 5,
"end_line": 5
},
"identifiers": [
{
"type": "semgrep_id",
"name": "bandit.B501",
"value": "bandit.B501",
"url": "https://semgrep.dev/r/gitlab.bandit.B501"
},
{
"type": "cwe",
"name": "CWE-295",
"value": "295",
"url": "https://cwe.mitre.org/data/definitions/295.html"
},
{
"type": "owasp",
"name": "Sensitive Data Exposure",
"value": "A3"
},
{
"type": "bandit_test_id",
"name": "Bandit Test ID B501",
"value": "B501"
}
]
}
],
"remediations": [],
"scan": {
"scanner": {
"id": "semgrep",
"name": "Semgrep",
"url": "https://github.com/returntocorp/semgrep",
"vendor": {
"name": "GitLab"
},
"version": "0.69.1"
},
"type": "sast",
"start_time": "2021-11-01T23:18:19",
"end_time": "2021-11-01T23:18:28",
"status": "success"
}
}
NOTE the "suppressions": [{"kind": "inSource"}] entry in semgrep.sarif, which labels the ignored finding as suppressed. The finding is nevertheless still included in gl-sast-report.json, with no such label.
- Do steps 2-5 once again, with the only change being in step 2, i.e. use registry.gitlab.com/gitlab-org/security-products/analyzers/semgrep:2.10 (alternatively, install a different version of semgrep: pip install semgrep==0.60.0).
- This image tag (along with 2.10.1) is the last one that used semgrep v0.60.0, which did not have the "includes ignored findings, but labels them as suppressed" feature. Images from 2.11 onwards use semgrep v0.65.0 or later.
The feature in question was introduced in semgrep v0.61.0: https://github.com/returntocorp/semgrep/commit/3eac29d1b25db44087497c31763c75a9fcfb56cc
- You will notice that both the json and sarif files have one less entry, i.e. they lack an entry for test-yes-ignore.py.
- Test the same in the UI by viewing the artifacts (gl-sast-report.json) as seen in the example projects below.
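The comparison in the steps above can also be scripted. The following is a minimal sketch, not part of the analyzer: it assumes the standard SARIF layout (results nested under runs) and the report fields shown in the output above, and the helper names (load, suppressed_keys, leaked_findings) are made up for illustration. The inline SARIF and REPORT dictionaries are trimmed-down stand-ins for the two files.

```python
import json

# Trimmed-down stand-ins for semgrep.sarif and gl-sast-report.json,
# keeping only the fields that the comparison below relies on.
SARIF = {
    "runs": [{
        "results": [
            {
                "ruleId": "bandit.B501",
                "locations": [{"physicalLocation": {
                    "artifactLocation": {"uri": "/src/test-no-ignore.py"},
                    "region": {"startLine": 5}}}],
            },
            {
                "ruleId": "bandit.B501",
                "locations": [{"physicalLocation": {
                    "artifactLocation": {"uri": "/src/test-yes-ignore.py"},
                    "region": {"startLine": 5}}}],
                "suppressions": [{"kind": "inSource"}],
            },
        ]
    }]
}

REPORT = {
    "vulnerabilities": [
        {"location": {"file": "src/test-no-ignore.py", "start_line": 5},
         "identifiers": [{"type": "semgrep_id", "value": "bandit.B501"}]},
        {"location": {"file": "src/test-yes-ignore.py", "start_line": 5},
         "identifiers": [{"type": "semgrep_id", "value": "bandit.B501"}]},
    ]
}


def load(path):
    """Read a JSON file (hypothetical helper for use with the real files)."""
    with open(path) as handle:
        return json.load(handle)


def suppressed_keys(sarif):
    """Collect (file, line, rule) keys for SARIF results that carry suppressions."""
    keys = set()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            if not result.get("suppressions"):
                continue  # not suppressed, nothing to record
            for loc in result.get("locations", []):
                phys = loc["physicalLocation"]
                # The SARIF uri has a leading slash, the report path does not.
                uri = phys["artifactLocation"]["uri"].lstrip("/")
                keys.add((uri, phys["region"]["startLine"], result["ruleId"]))
    return keys


def leaked_findings(sarif, report):
    """List report vulnerabilities that match a suppressed SARIF result."""
    suppressed = suppressed_keys(sarif)
    leaked = []
    for vuln in report.get("vulnerabilities", []):
        loc = vuln["location"]
        for ident in vuln.get("identifiers", []):
            if ident.get("type") != "semgrep_id":
                continue
            key = (loc["file"].lstrip("/"), loc["start_line"], ident["value"])
            if key in suppressed:
                leaked.append(key)
    return leaked


if __name__ == "__main__":
    print(leaked_findings(SARIF, REPORT))
```

To run it against the real files inside the container, replace the stand-ins with load("/src/semgrep.sarif") and load("/gl-sast-report.json"). A non-empty result means that suppressed findings made it into the report.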
Example Project
- MR with pipeline and artifacts for latest image (and semgrep==0.69.1): chris-semgrep/semgrep-ignore-0.69.1!1 (closed)
- MR with pipeline and artifacts for image version 2.10 (and semgrep==0.60.0): chris-semgrep/semgrep-ignore-0.60.0!1 (closed)
What is the current bug behavior?
Semgrep scans using semgrep v0.61.0 or later include ignored findings in the sarif output and label them as suppressed. The sarif translator does not account for this, so the resulting gl-sast-report.json also includes the ignored findings but does not label them as suppressed or dismissed.
What is the expected correct behavior?
The sarif translator should recognize suppressed findings and either exclude them from the final report or, if possible, include them labelled as dismissed.
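As a rough illustration of that behavior, the sketch below separates suppressed results from active ones before any report is built. It is written in Python for consistency with the test files and is not the analyzer's actual code. The function name partition_results is hypothetical, and treating any non-empty "suppressions" array as "suppressed" is a simplifying assumption.

```python
def partition_results(results):
    """Split SARIF results into (active, suppressed) lists.

    Assumption: a result counts as suppressed when it carries a
    non-empty "suppressions" array, as in the semgrep.sarif output.
    """
    active, suppressed = [], []
    for result in results:
        if result.get("suppressions"):
            suppressed.append(result)
        else:
            active.append(result)
    return active, suppressed


# Minimal stand-ins for the two results in the semgrep.sarif output.
results = [
    {"ruleId": "bandit.B501", "file": "/src/test-no-ignore.py"},
    {"ruleId": "bandit.B501", "file": "/src/test-yes-ignore.py",
     "suppressions": [{"kind": "inSource"}]},
]

active, suppressed = partition_results(results)
print(len(active), "active,", len(suppressed), "suppressed")
```

Only the active list would then be translated into vulnerabilities. The suppressed list could be dropped or, if the report format allows it, emitted with a dismissed label.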
Relevant logs and/or screenshots
Image: registry.gitlab.com/gitlab-org/security-products/analyzers/semgrep and Semgrep: 0.69.1
Image: registry.gitlab.com/gitlab-org/security-products/analyzers/semgrep:2.10 and Semgrep: 0.60.0
Output of checks
Happens on both Self-Managed and SaaS instances since they all pull from the same registry by default.
Results of GitLab environment info
N/A
Results of GitLab application Check
N/A
Possible fixes
After getting a shell in the docker image, look at the file /usr/local/lib/python3.9/site-packages/semgrep/formatter/sarif.py. At line 124, the keep_ignores method inherited from /usr/local/lib/python3.9/site-packages/semgrep/formatter/base.py is overridden to return True.
We can either patch this to return False, or make changes to our sarif translator to accommodate suppressed findings from semgrep so that it can label them as dismissed or suppressed in the final output.
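The first option can be pictured with the sketch below. The classes are hypothetical stand-ins and not semgrep's source: only the method name keep_ignores and the True return value in the sarif formatter come from the description above. The base default of False and the emit helper are assumptions made for illustration.

```python
class BaseFormatterSketch:
    """Stand-in for the base formatter (assumed to drop ignored findings)."""

    def keep_ignores(self) -> bool:
        return False


class SarifFormatterSketch(BaseFormatterSketch):
    """Stand-in for the sarif formatter, which overrides the method to return True."""

    def keep_ignores(self) -> bool:
        return True


class PatchedSarifFormatterSketch(SarifFormatterSketch):
    """Stand-in for the patched variant that drops ignored findings again."""

    def keep_ignores(self) -> bool:
        return False


def emit(formatter, findings):
    """Hypothetical helper: keep ignored findings only if the formatter asks for it."""
    if formatter.keep_ignores():
        return list(findings)
    return [finding for finding in findings if not finding["ignored"]]


findings = [
    {"path": "test-no-ignore.py", "ignored": False},
    {"path": "test-yes-ignore.py", "ignored": True},
]

print(len(emit(SarifFormatterSketch(), findings)))
print(len(emit(PatchedSarifFormatterSketch(), findings)))
```

Patching the formatter hides suppressed findings from every consumer of the sarif file, whereas handling them in the translator keeps the information available for labelling.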