Advanced SAST: report Unverified Vulnerabilities (Beta) (#15649) · Epics · GitLab.org

Advanced SAST: report Unverified Vulnerabilities (Beta)

## Executive Summary \_**Very short version:** Improve user perception of scan recall by relaxing the requirement that a taint-based vulnerabilities be traced all the way to a defined source of user input. (Improving recall is the same as improving the false negative rate.) Advanced SAST reports taint-based vulnerabilities only when we can verify that user input flows all the way to a dangerous "sink" where it is used in a way that creates a security risk. (Taint-based vulns include many of the most severe vulnerabilities, like SQL injection, XSS, and other injections.) This is a great way to reduce noise from vulnerabilities that are not truly exploitable and is a major reason why users report that Advanced SAST reduces false positives compared to other solutions. However, being so strict can lead to false-negative results (perceived or actual) if: 1. We're right, and the code is not exploitable today—but the user wants to fix it anyway. - This is a completely reasonable policy for an organization to have. Essentially, it reduces risk that there is a land-mine sitting in the code waiting to be stepped on by a new feature. 2. We're wrong: the code is actually exploitable today, but we cannot track it for a technical reason. This can be for various reasons: - we have not handled a particular language construct correctly, for example tracking taint through a particular data structure - a rule is incorrect or missing (for a source of user input, most likely) - another bug or technical limitation prevents the tracking of the vulnerability. To address this, we will broaden our detection to report risky practices or "near misses" where we've discovered part of a vulnerability but cannot trace it all the way from source to sink. We will design the detection logic changes and the user interactions to mitigate the risk that, by solving a problem with false negatives, we create a problem of user-perceived false positives. ### Engineering Assessment Minimum criteria assessment only: - A configuration toggle to enable the behavior. - This does not include any sort of UI. It could be a yaml/toml configuration, or environment variable that changes the behavior of the analyzer. - The analyzer uses the existing schema to annotate "sink-only" findings. - The [`details`](https://gitlab.com/gitlab-org/security-products/security-report-schemas/-/blob/master/src/vulnerability-details-format.json?ref_type=heads) property can be used for this, and the information appears in the vulnerability page without additional FE effort. - ~"group::vulnerability research" recommends which rules should be affected by the "sink-only" behavior. - Any changes required would be carried by ~"group::static analysis" and reviewed by VR. - The analyzer must not report duplicated findings. I.e.: if a source-sink match is already reported, then a sink-only match must not be reported. - Other minor parts of the report need adjustment, as described in [Minimum Acceptable Workflow](#minimum-acceptable-workflow) (e.g. add "Risk of" to finding title). - Potential ~spike issues: - The behavior of the Code Flow feature for sink-only matches. - Vuln. Mgmt behavior when a sink-only match becomes a source-sink match. In this case, we want to remove the sink-only metadata so that the user knows that the vulnerability is now exploitable. ### Dependencies - Team dependencies: - To meet minimum criteria, ~"group::static analysis" should be able to update the engine, possibly with ~"group::vulnerability research" assistance to coordinate any rule metadata or logging changes. - Consultation might be needed from ~"group::security insights" if we need to clarify the UI's handling of "source" semantics or similar. Note that engineers who are now in ~"group::security platform management" implemented this feature during the Sec realignment. - To meet best-case criteria (which are not required for an initial delivery), we would require: - Possible schema updates or new filtering from ~"group::security insights". - A new policy attribute from ~"group::security policies". - For both of these possible dependencies: note that ideally this would be aligned with in with reachability/validity attributes for SCA and Secret Detection, which could mean that little or no SAST-specific additional work is needed. - Epic/Issue dependencies - _Link to dependent epics/issues via the linked items widget below for ease of drill down_ - External dependencies: \[Any external dependencies\] <details> <summary> ## Other Interlock administrivia </summary> #### DRIs - **PM**: @connorgilbert  - **EM**: @thiagocsf  - **UX/PDM**: No stable counterpart; cc @jmandell  - **Group(s)**: ~"group::static analysis"  - **Engineering Owner**: @twoodham #### Initiative Driver - Product or Engineering? - [ ] **Product-driven initiatives (P1/P2/P3)** - Customer-facing features or improvements driven by Product teams that require engineering resources and commitment - These initiatives require a Product Priority label (P1/P2/P3) - They may also receive GTM tier labels (T1/T2/T3) for external communication - [ ] **Engineering-driven initiatives (E1/E2/E3)** - Internal technical improvements that may not have customer-facing components - These initiatives require an Engineering Priority label (E1/E2/E3) - They have internal visibility only and are not externally communicated - Examples include: technical debt reduction, infrastructure improvements, refactoring, dependency upgrades #### Sizing and Funding (Optional) - **Size**: \[XS/S/M/L/XL\] - **Funding Status**: \[Funded/Partially funded/Not funded\] #### Hygiene Guidelines :bulb: \_See additional details about this process at https://handbook.gitlab.com/handbook/product-development/r-and-d-interlock/ ##### :one: Pre-Interlock - [ ] Update epic description with all relevant information - [ ] Ensure all dependencies are identified - [ ] Apply appropriate labels (see below) - [ ] Apply target delivery Milestone - [ ] Update interlock status as discussions progress (via label) ##### :two: Post-Interlock: once quarter begins - Update health status weekly (via label) - Document any newly identified risks or dependencies - Link to implementation epics/issues as work begins - Flag any scope or timeline changes immediately  </details> ## Motivation We currently have many user reports of FNs, and we observe these in internal benchmarking as well. We are penalized for these FNs in evaluations, and they cause escalations to ~"group::static analysis" and ~"group::vulnerability research". We cannot have ruleset updates be the only response. With Advanced SAST, the cost of FNs is higher to us currently than FPs— - False negatives make you question if a tool is trustworthy at all—it is missing a risk you expected to find! - By comparison, FPs primarily create noise. Missing results is a more severe concern and impossible to self-remediate as a user; FPs, while still important, are less severe and can be addressed, at least partially, with customizations like disabling rules. This particular feature helps us bridge the gap from what the engine can already find (sinks and propagation) without requiring every single source and propagator to be tracked correctly. ### Note on definitions "False Positive" and "False Negative" sound very cut-and-dry and binary. But in practice, when people assess if a vuln is one or the other, there is often an element of subjectivity-some people truly want to see the "noise" to feel more confident that they aren't missing something, while others will say a result is an FP if it's not 100% exploitable when taking into account any other countermeasures like network firewalls. Today our strict definitions prevent us from meeting this nuance because we omit anything that isn't verified. ## Feature details ### Note on terminology We've described this feature as identifying "risky practices" and "near-miss findings". It could also be called "incipient vulnerabilities" or similar. We will need to settle on a particular term that we find to be clear and descriptive, and use it consistently. ### Core logic The core of the feature is the detection engine returning partial results when a sink is located, but no defined source. The engine should return this result, appropriately annotated, to be surfaced in the UI. Note: to avoid confusion, we shouldn't duplicate or overlap with actual exploitable paths to the same sink. We can settle on a specific definition based on how the feature is going to be implemented, but can use this as a starting point: "do not report a near-miss result if a full source-to-sink path has already been identified for a given sink". ### User experience The core balance we are seeking here is: - We want to show additional results... - But we want to be clear that they are not verifiably exploitable today. These partial/near-miss results are legitimate issues, and _we do believe they should be fixed._ They are not noise. The distinction is that we can not verify that they are processing unsanitized user input, so we can't prove that they are both severe and urgent. #### Downstream change(s) We will need to verify that existing features that display vulns aren't going to become incorrect when we start reporting these results. Specifically, we need to be sure that "source" is not being calculated implicitly, for example always labeling `flow[0]` as the "source" in a source-to-sink flow. This is because we will now have partial flows that have propagators and a sink, but do not have a defined source of user input like an HTTP API handler or similar. #### Minimum acceptable workflow The more we can get out of the "best-case" section below, the better, but we can ship the feature under the following constraints: - Title clearly indicates that this is a possible vuln but not verified to be exploitable by appending the word "Unverified". "Unverified SQL injection" for example. - Note: Vuln titles are currently always CWE names, which makes findings harder to read. There is a related issue to make vuln titles more descriptive (https://gitlab.com/groups/gitlab-org/-/epics/15356+). To avoid confusion absolutely must not report these new vulnerabilities without some "unverified" indication. - Text at the top of the vulnerability description explains that we found the risky practice but could not trace it fully to user input. - This text can be in the vulnerability report output or in the UI natively (only for this type of result). - Sink location is shown in the code flow view. - Severity is capped at Medium, regardless of CWE or the severity the vulnerability would have if actually exploitable. - This is to avoid being too noisy with less-validated findings. This is a crude limitation but will help us avoid security requirements that focus on High and Critical findings, without always slipping into the lower-value Low/Info buckets. - Filtering in the vulnerability report can be done with an identifier or with text, but we do not need a specific dedicated filter if we can't get one in time. * If we can get a filter in time, the name of the filter could be "Verification status" with the options of "Verified vulnerability" and "Unverified vulnerability" that are both enabled by default. #### Best-case workflow The more of these behaviors we have, the better: - Exploitability is explicitly defined using a similar rating system as CVES use for static reachability, and similar to verifiably active secrets, for alignment across scan types. - Exploitability is filterable on reporting pages. - Exploitability is decorated on MR widget & other UIs, so users can focus on verifiably exploitable findings. - Exploitability is available as a policy attribute, so users can target only verifiably exploitable findings for policy control. - Engine can specify a level of confidence. This is related to exploitability but not identical. (It also will be useful for other rules, not just this particular type of near miss/partial taint match.) - Note that SAST will always need to specify some level of confidence — this is not just a temporary accommodation for ruleset issues. Today we acommodate the lack of a confidence field by tweaking the severity value for different rules within the same vulnerability class. ### Rollout control We do not need to offer a new configuration option or feature flag to identify these vulnerabilities. However, we do need to be sure that the "near miss" version of a particular taint-based rule is distinguishable from a fully-validated one, so that users can disable the near-miss variant if it's too noisy for them. This means that at least one identifier must be different between these result types, so that [existing customization features](https://docs.gitlab.com/user/application_security/sast/customize_rulesets/#disable-predefined-gitlab-advanced-sast-rules) can be used to target the partially-traced findings specifically, without affecting fully-traced results. ## Related discussions In early discussion about this feature, we discussed ways to limit disruption/FPs: - Limit this only to particular types of near-vulnerabilities. https://gitlab.com/groups/gitlab-org/-/epics/15649#note_2199476091 - Provide less-severe detection results for types of input that are less clearly user-controlled. https://gitlab.com/groups/gitlab-org/-/epics/15649#note_2199481190 We should evaluate if there is anything to learn from those discussions. If not, they can be closed or moved out of the epic. ### Target Metric * Reduction in potential false negatives for Advanced SAST, which can lead to an increase in adoption and better performance on competitive bakeoffs. See [my internal comment](https://gitlab.com/groups/gitlab-org/-/epics/15649#note_2890280911) for more details.

epic