Prefer Semgrep findings over other analyzers
Problem to solve
Within Category:SAST, we try to deduplicate findings from multiple analyzers. The current approach leads us to prefer specific analyzers over others. We made a conscious decision to prefer other analyzers over semgrep while semgrep was being developed. However, we're now moving to standardize on semgrep once it becomes generally available. We need to flip this relationship so that semgrep is preferred over other Category:SAST analyzers.
Proposal
Allow secondary identifier match if there is no primary identifier (or uuid) match
See #331626 (closed)
Implementation
(See #353271 (comment 859225232))
- When processing a security_finding, we attempt to find a vulnerability_finding by UUID.
  - If UUID match, update vulnerability_finding.
  - If no UUID match, calculate UUIDs for secondary identifiers (non-generic) and attempt vulnerability_finding lookup by those UUIDs.
    - If no UUIDs match, insert new vulnerability_finding.
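The lookup steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual implementation: the `SecurityFinding` class, the `finding_uuid` and `is_generic` helpers, the in-memory `store`, and the identifier strings are all hypothetical, and the way real UUIDs are derived is not described in this issue. The sketch also assumes that a secondary UUID match updates the existing record, which the proposal implies but the steps do not spell out.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass
class SecurityFinding:
    """Hypothetical stand-in for a security_finding from one analyzer."""
    project_id: int
    location: str
    primary_identifier: str
    secondary_identifiers: list[str] = field(default_factory=list)


def finding_uuid(project_id: int, location: str, identifier: str) -> str:
    # Assumption: a UUID is derived deterministically from project,
    # location, and identifier. The real derivation may differ.
    name = f"{project_id}|{location}|{identifier}"
    return str(uuid.uuid5(uuid.NAMESPACE_URL, name))


def is_generic(identifier: str) -> bool:
    # Assumption: "generic" means broad classifications such as CWE or
    # OWASP, which are too coarse to identify a single finding.
    return identifier.startswith(("cwe:", "owasp:"))


def upsert(store: dict[str, SecurityFinding], finding: SecurityFinding) -> str:
    """Apply the lookup order: primary UUID, then secondary UUIDs, then insert."""
    primary = finding_uuid(
        finding.project_id, finding.location, finding.primary_identifier
    )
    if primary in store:
        store[primary] = finding  # UUID match: update
        return "updated_primary"

    for identifier in finding.secondary_identifiers:
        if is_generic(identifier):
            continue  # only non-generic secondary identifiers are tried
        candidate = finding_uuid(finding.project_id, finding.location, identifier)
        if candidate in store:
            store[candidate] = finding  # secondary UUID match: update in place
            return "updated_secondary"

    store[primary] = finding  # no match at all: insert a new record
    return "inserted"


if __name__ == "__main__":
    store: dict[str, SecurityFinding] = {}
    bandit = SecurityFinding(1, "app.py:10", "bandit_test_id:B303")
    semgrep = SecurityFinding(
        1, "app.py:10", "semgrep_id:bandit.B303",
        ["cwe:327", "bandit_test_id:B303"],
    )
    print(upsert(store, bandit))   # inserted
    print(upsert(store, semgrep))  # updated_secondary
    print(len(store))              # 1
```

In this sketch the semgrep finding lists the bandit identifier as a secondary identifier, so it resolves to the record that bandit created earlier instead of producing a duplicate.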
Note: While implementing this, it was discovered that there is no need to change ANALYZER_ORDER for existing analyzers. For existing deduplication scenarios (e.g. bandit+semgrep or gosec+semgrep), we want deduplication to continue to work as it did before so that we don't create extra work for customers. While we still host analyzer images for analyzers that are being replaced, we need to support deduplication. As we add semgrep support for more languages, we will want to follow the pattern for bandit and spotbugs.