Technical Discovery: Remap SAST findings from Bandit to Semgrep
Summary
This issue is a spin-off from a related issue. We are planning to replace bandit with semgrep. After the replacement, semgrep will report findings that correspond to bandit vulnerabilities that were already fixed, which is frustrating for users. We therefore need a mechanism to remap existing bandit findings to semgrep findings.
In this issue, we will look for a generic remapping solution that can be extended to other analyzers. For now, the focus is on Bandit and Semgrep, but the same solution should later extend to remapping eslint findings to semgrep findings.
Background
Problems
Case-1
The following case illustrates the problem that arises if we don't handle the remapping of bandit and semgrep findings. Suppose a project has 500 bandit findings. If we replace bandit with semgrep, the MR widget will behave oddly: it will display all 500 bandit findings as fixed and report 500 new semgrep findings. This is very annoying for users.
Case-2
Users will see a related problem on the vulnerability report page. If we replace bandit with semgrep, the bandit findings will appear as no longer detected under the activity filter.
Current Implementation
The following code segments are used to calculate the fixed and added findings for the latest pipeline.
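As a rough illustration of this kind of calculation (not the actual GitLab code), the fixed and added sets can be thought of as set differences over a per-finding key. The sketch below assumes a hypothetical key made of analyzer name, rule ID, and location. The `Finding` type, the `compare` helper, and the rule IDs are all invented for illustration.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    """Hypothetical, simplified finding; not the GitLab data model."""
    analyzer: str   # e.g. "bandit" or "semgrep"
    rule_id: str    # analyzer-specific rule identifier (made up here)
    location: str   # e.g. "app.py:10"


def compare(base: set[Finding], head: set[Finding]) -> tuple[set[Finding], set[Finding]]:
    """Return (fixed, added) as plain set differences."""
    fixed = base - head   # present before, missing now
    added = head - base   # missing before, present now
    return fixed, added


# Case-1 in miniature: the same flaw reported by two different analyzers.
base = {Finding("bandit", "B303", "app.py:10")}
head = {Finding("semgrep", "bandit.B303", "app.py:10")}
fixed, added = compare(base, head)
print(len(fixed), len(added))  # 1 1
```

Because the key differs between analyzers, the one underlying flaw shows up once as fixed and once as added. Scaled to 500 findings, this is the behavior described in Case-1.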
Solution
We have already resolved the deduplication issue. Although bandit and semgrep run in parallel, we give priority to bandit findings and discard all duplicate semgrep findings, using the location fingerprint to identify duplicates. We can address the remapping issue by completing the following steps in order:
- Step 1: Fall back to secondary identifier matching when there is no signature match and no uuid match. (#331626 (closed))
- Step 2: Add the bandit and eslint identifiers as secondary identifiers in semgrep findings. Ensure that the semgrep identifier is the primary identifier in semgrep findings. (#331551 (closed), #331628 (closed))
- Step 3: Give semgrep findings the highest priority over bandit and eslint findings. (#328062 (closed))
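The matching order in Steps 1 and 2 can be sketched as below. This is a minimal sketch under stated assumptions, not the Rails implementation. The `Finding` class, `find_match`, and the identifier strings are hypothetical. The sketch also assumes that the identifier fallback is scoped to the same location, which this issue does not state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Finding:
    """Hypothetical, simplified finding; not the GitLab data model."""
    uuid: str
    location: str
    signatures: set[str] = field(default_factory=set)
    # identifiers[0] is the primary identifier, the rest are secondary.
    identifiers: list[str] = field(default_factory=list)


def find_match(new: Finding, existing: list[Finding]) -> Optional[Finding]:
    """Try signatures first, then uuid, then any shared identifier."""
    for old in existing:
        if new.signatures & old.signatures:
            return old
    for old in existing:
        if new.uuid == old.uuid:
            return old
    # Step 1 fallback: identifier matching (assumed: same location only).
    for old in existing:
        shared = set(new.identifiers) & set(old.identifiers)
        if shared and new.location == old.location:
            return old
    return None


old_bandit = Finding(uuid="u-1", location="app.py:10",
                     signatures={"sig-a"}, identifiers=["bandit.B303"])
# Step 2: the semgrep ID is primary, the bandit ID rides along as secondary.
new_semgrep = Finding(uuid="u-2", location="app.py:10", signatures={"sig-b"},
                      identifiers=["semgrep.md5-used", "bandit.B303"])
assert find_match(new_semgrep, [old_bandit]) is old_bandit
```

Here neither the signatures nor the uuid match, so the semgrep finding is linked to the existing bandit finding only through the shared secondary identifier.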
The changes in Rails should resolve the remapping, provided that tracking-calculator computes the signatures properly.
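The deduplication described at the start of this section, and the priority change in Step 3, can be pictured with the sketch below. It assumes each finding carries an analyzer name and a location fingerprint string. The priority tables and the `dedupe` helper are hypothetical names, not taken from the codebase.

```python
# Hypothetical priority tables: lower number wins. Names are illustrative only.
BANDIT_FIRST = {"bandit": 0, "eslint": 0, "semgrep": 1}
SEMGREP_FIRST = {"semgrep": 0, "bandit": 1, "eslint": 1}


def dedupe(findings: list[dict], priority: dict[str, int]) -> list[dict]:
    """Keep one finding per location fingerprint, preferring the higher-priority analyzer."""
    kept: dict[str, dict] = {}
    for f in findings:
        key = f["location_fingerprint"]
        current = kept.get(key)
        if current is None or priority[f["analyzer"]] < priority[current["analyzer"]]:
            kept[key] = f
    return list(kept.values())


report = [
    {"analyzer": "semgrep", "location_fingerprint": "abc123"},
    {"analyzer": "bandit", "location_fingerprint": "abc123"},
]
print([f["analyzer"] for f in dedupe(report, BANDIT_FIRST)])   # ['bandit']
print([f["analyzer"] for f in dedupe(report, SEMGREP_FIRST)])  # ['semgrep']
```

With the first table, the duplicate semgrep finding is discarded, as in the current behavior. Step 3 amounts to switching to the second table so that the semgrep finding is the one that survives.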
Pros
- This solution will resolve the remapping issues described in #322384 (closed), provided the steps are implemented in order.
Cons
- Secondary identifier matching adds computational cost. However, the overhead should be modest, because we are only adding one extra matching step.