Prefer Semgrep findings over other analyzers

Problem to solve

Within Category:SAST, we try to deduplicate findings from multiple analyzers. The current approach leads us to prefer specific analyzers over others. We made a conscious decision to prefer other analyzers over Semgrep while the Semgrep analyzer was being developed. However, we're now moving to standardize on Semgrep once it becomes generally available. We need to flip this relationship so that Semgrep is preferred over other Category:SAST analyzers.

Proposal

Allow a secondary identifier match when there is no primary identifier (or UUID) match.

See #331626 (closed)

Implementation

(See #353271 (comment 859225232))

When processing a security_finding, we attempt to find an existing vulnerability_finding by UUID:
1. If the UUID matches, update that vulnerability_finding.
2. If the UUID does not match, calculate UUIDs for the secondary (non-generic) identifiers and attempt the vulnerability_finding lookup with those UUIDs.
3. If none of those UUIDs match, insert a new vulnerability_finding.
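
The three steps above can be sketched as follows. This is a minimal illustration in Python, not the actual GitLab implementation: the `Identifier` and `SecurityFinding` classes, the `GENERIC_TYPES` set, the `NAMESPACE` value, the inputs to `finding_uuid`, and the in-memory `store` dictionary are all assumptions made for the example. It also assumes that a finding's own UUID is derived from its primary identifier, which this issue does not spell out.

```python
import uuid
from dataclasses import dataclass

# Hypothetical namespace for the example; the real value is not given here.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")

# Assumption: identifier types treated as "generic" and therefore never
# used for secondary matching.
GENERIC_TYPES = {"cwe", "owasp"}


@dataclass(frozen=True)
class Identifier:
    type: str   # e.g. "semgrep_id", "bandit_test_id", "cwe"
    value: str


@dataclass
class SecurityFinding:
    project_id: int
    location_fingerprint: str
    identifiers: list  # first entry is the primary identifier


def finding_uuid(finding, identifier):
    """Derive a deterministic UUID from the finding's project, location, and one identifier."""
    name = "-".join([
        str(finding.project_id),
        finding.location_fingerprint,
        identifier.type,
        identifier.value,
    ])
    return str(uuid.uuid5(NAMESPACE, name))


def ingest(finding, store):
    """Apply the three lookup steps against `store` (a dict of uuid -> finding)."""
    primary, *secondary = finding.identifiers

    # Step 1: look up by the UUID calculated from the primary identifier.
    primary_uuid = finding_uuid(finding, primary)
    if primary_uuid in store:
        store[primary_uuid] = finding
        return ("updated", primary_uuid)

    # Step 2: fall back to UUIDs calculated from non-generic secondary identifiers.
    for identifier in secondary:
        if identifier.type in GENERIC_TYPES:
            continue
        candidate = finding_uuid(finding, identifier)
        if candidate in store:
            store[candidate] = finding
            return ("updated", candidate)

    # Step 3: nothing matched, so record a new finding.
    store[primary_uuid] = finding
    return ("inserted", primary_uuid)
```

In this sketch, a semgrep finding that carries a bandit identifier as a secondary identifier updates the record that bandit created earlier for the same location instead of creating a duplicate. A finding that shares only a generic identifier such as a CWE with an existing record is still inserted as new.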

Note: While implementing this, we discovered that there is no need to change ANALYZER_ORDER for existing analyzers. Existing deduplication scenarios (for example, bandit+semgrep or gosec+semgrep) should continue to work as they did before so that we don't create extra work for customers. While we still host images for analyzers that are being replaced, we need to support deduplication with them. As we add Semgrep support for more languages, we will want to follow the pattern used for bandit and spotbugs.


Edited Apr 21, 2022 by rossfuhrman