Deduplicate SAST findings for bandit and semgrep
The following is an excerpt from this comment:
We can de-duplicate security findings if we are able to add the bandit identifier to the identifiers list of the semgrep vulnerability. To achieve this, we need a mapping between bandit identifiers and semgrep identifiers inside the semgrep analyzer. Then, whenever the semgrep analyzer pushes an entry to the vulnerabilities list in gl-sast-report.json, we need to ensure that the vulnerability found by semgrep contains both the bandit and semgrep identifiers.
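A minimal sketch of the identifier mapping described above, written in Python purely for illustration (the analyzer itself need not be written in Python). The rule IDs in the mapping, the identifier field values, and the helper names `bandit_identifier` and `add_bandit_identifier` are all hypothetical; the real mapping and identifier shapes would have to come from the analyzers.

```python
# Hypothetical mapping from semgrep rule IDs to bandit test IDs.
# The actual rule IDs and where the mapping lives are assumptions.
SEMGREP_TO_BANDIT = {
    "bandit.B303-1": "B303",
    "bandit.B506": "B506",
}


def bandit_identifier(test_id):
    """Build an identifier entry for a bandit test ID (field values assumed)."""
    return {
        "type": "bandit_test_id",
        "name": f"Bandit Test ID {test_id}",
        "value": test_id,
    }


def add_bandit_identifier(vulnerability):
    """Return a copy of a semgrep vulnerability that also carries the
    matching bandit identifier, when the mapping has one."""
    identifiers = list(vulnerability.get("identifiers", []))
    for ident in vulnerability.get("identifiers", []):
        if ident.get("type") != "semgrep_id":
            continue
        test_id = SEMGREP_TO_BANDIT.get(ident.get("value"))
        if test_id is None:
            continue  # no bandit equivalent known for this rule
        extra = bandit_identifier(test_id)
        if extra not in identifiers:
            identifiers.append(extra)
    return {**vulnerability, "identifiers": identifiers}
```

The function leaves the input untouched and is safe to apply twice, so it could run just before an entry is pushed to the vulnerabilities list.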
Moreover, we can maintain an order between bandit (first) and semgrep (second), similar to this analyzer order, so that bandit's source report is processed first when we create the target report. This way, the final report will contain all bandit findings plus the semgrep findings, without duplicates.
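To illustrate why processing order matters, here is a rough Python sketch of an ordered merge. It assumes, for illustration only, that two findings count as duplicates when they share a file, a start line, and at least one identifier; the real comparison rules may differ. `fingerprint_keys` and `merge_reports` are hypothetical names.

```python
def fingerprint_keys(vulnerability):
    """One key per identifier, each tied to the finding's location.
    The (file, start_line) location shape is an assumption."""
    location = vulnerability.get("location", {})
    place = (location.get("file"), location.get("start_line"))
    return {
        (place, ident["type"], ident["value"])
        for ident in vulnerability.get("identifiers", [])
    }


def merge_reports(ordered_reports):
    """Merge reports in the given order, keeping the first finding
    seen for each key and dropping later duplicates."""
    seen = set()
    merged = []
    for report in ordered_reports:
        for vulnerability in report.get("vulnerabilities", []):
            keys = fingerprint_keys(vulnerability)
            if keys & seen:
                continue  # an earlier report already covers this finding
            seen |= keys
            merged.append(vulnerability)
    return merged
```

Because the bandit report comes first, a semgrep finding that carries the same bandit identifier at the same location is dropped, while semgrep-only findings are kept.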
Implementation plan
- In MergeReportsService, keep a hash to maintain the order of report processing, similar to the analyzer order.
- Sort the analyzers' source reports (bandit and semgrep) according to that order. We may need to write a method similar to this part.
- In the semgrep analyzer, we need to make sure that each semgrep vulnerability contains both bandit and semgrep identifiers.
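The ordering steps in the plan could look roughly like the following sketch. It is written in Python for illustration only and is not the service's actual code. It assumes each source report exposes its analyzer ID under `scan.analyzer.id`; `ANALYZER_ORDER`, `analyzer_id`, and `sort_reports` are hypothetical names.

```python
# Hypothetical processing order: lower rank is processed first.
ANALYZER_ORDER = {"bandit": 0, "semgrep": 1}


def analyzer_id(report):
    """Read the analyzer ID from a report (field path is an assumption)."""
    return report.get("scan", {}).get("analyzer", {}).get("id")


def sort_reports(reports):
    """Sort source reports by the configured analyzer order.
    Reports from unlisted analyzers go last, keeping their relative
    order because sorted() is stable."""
    fallback = len(ANALYZER_ORDER)
    return sorted(
        reports,
        key=lambda report: ANALYZER_ORDER.get(analyzer_id(report), fallback),
    )
```

Keeping the order in a single hash means adding another analyzer pair later only requires a new entry rather than a change to the sorting logic.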