Improve GitLab-maintained SAST rulesets
Contents of this epic:
- [Status](#status)
- Background
- [Context](#context)
- [Problem](#problem)
- [Opportunity](#opportunity)
- [Plan](#plan)
- [Reference materials](#materials)
## Status
We are continuing to update the GitLab SAST ruleset as part of a continuous improvement program. This epic was for a time-bound objective and is now closed.
For details on what was accomplished:
- Check the epics and issues below.
- Check the [sast-rules CHANGELOG](https://gitlab.com/gitlab-org/security-products/sast-rules/-/blob/main/CHANGELOG.md?ref_type=heads), which is continuously updated as each change is made.
## Context
GitLab SAST ships with a set of default rules. The source of the rules depends on which analyzer is used:
- The Semgrep-based GitLab SAST analyzer uses GitLab-managed rules.
- GitLab maintains and supports these rules. They originated as translations or conversions of rules from other open-source analyzers previously used in GitLab SAST, and have since been updated to reflect customer feedback and GitLab knowledge about the underlying issues.
- Other analyzers ship with their default rules provided by upstream open-source scanners.
The default rules have a huge impact on user experience because they are the first taste of SAST for all users. We do offer ways to disable rules, customize metadata, or provide completely custom rules, but:
- Most customers would rather not have to apply local customizations; they'd prefer better defaults. False-positive results are a major driver of interest in customization features; the customization is a solution, not the underlying problem. We prefer to [start with the problem, not the solution](https://about.gitlab.com/handbook/product/product-principles/#our-product-principles).
- As a product, we prefer [convention over configuration](https://about.gitlab.com/handbook/product/product-principles/#convention-over-configuration), which includes having good default behavior.
- Bad results are costly for customers: if they land on the default branch, they're tracked persistently in the Vulnerability Management database, which increases triage workload.
- The more customers have to customize SAST before its results are acceptable, the less likely they are to feel confident enabling SAST over a wide swath of projects using Scan Execution Policies or similar mechanisms.
## Problem
Some of the existing default rules provide little security value. This can cause real, actionable findings to be lost among less-actionable ones. Ultimately, these findings reduce the likelihood that SAST becomes a key part of developer or security workflows; even a small number of incorrect or noisy findings can lead people to lose faith in scan results overall. Unless results are reliably correct and actionable, security policies that require findings to be addressed won't be implemented.
## Opportunity
While updating the descriptions for our SAST rules, a number of opportunities for improvement were identified. We have the chance now to revisit past assumptions and make better choices about the rules we ship going forward.
With recent advancements like automatic resolution of findings for disabled or removed rules (https://gitlab.com/gitlab-org/gitlab/-/issues/368284) and foundational improvements like using sast-rules as the single source of truth for rules in the Semgrep-based analyzer, we can now more smoothly roll out rule changes to improve the efficacy of the default rulesets.
## Plan
We will approach the changes iteratively. This will deliver improvements to customers sooner, surface unexpected side-effects earlier and at smaller scale, and reduce the risk of project failure.
1. Prepare for changes.
- **DRI:** ~"group::static analysis"
- **Status:** Complete
- **Epic:** https://gitlab.com/groups/gitlab-org/-/epics/11395+
- **Notes:**
- We rely on some previous work, like https://gitlab.com/gitlab-org/gitlab/-/issues/368284.
- We also should be sure that we are ready for many more findings to be automatically resolved, for instance by improving the clarity of auto-resolution messaging: https://gitlab.com/gitlab-org/gitlab/-/issues/417087.
- We should take the time to prepare any documentation and in-app messaging to clarify what is happening as rules change: https://gitlab.com/gitlab-org/gitlab/-/issues/417101.
2. Remove low-efficacy/low-value rules.
- **DRI:** ~"group::vulnerability research"
- **Status:** Complete
- **Epic:** https://gitlab.com/groups/gitlab-org/-/epics/8170+
- **Notes:**
- In this phase, we are only deleting rules from the default ruleset of the Semgrep-based analyzer. When shipped, these changes will cause existing findings to be automatically resolved so that users no longer have to manually triage them.
- We plan to go language-by-language to reduce risk.
- We will also have to be sure that any "shadowed" analyzers (like Flawfinder, or SpotBugs for Scala) are configured to remove any rules that we remove from the Semgrep-based analyzer ruleset.
3. Enhance detection logic for existing rules, or add brand new rules, as necessary.
- **DRI:** ~"group::vulnerability research"
- **Status:** In progress
- **Epic:** https://gitlab.com/groups/gitlab-org/-/epics/10971+
- **Notes:**
- This may involve splitting one existing rule to more than one new rule. For example, we may want to provide more specific remediation guidance for one database (like MySQL) compared to another database with a different SQL dialect.
- We will need to figure out the procedure for this type of change, including any possibility for automated efforts to reduce triage overhead.
4. Update severity or other important metadata.
- **DRI:** ~"group::vulnerability research" for severity judgments, ~"group::static analysis" for rollout
- **Status:** In progress
- **Epic:** https://gitlab.com/groups/gitlab-org/-/epics/10970+
- **Main bodies of work:**
1. Update OWASP mappings, and add them when missing.
- This is a purely additive step and is not blocked by any other iteration. However, we may choose to do it alongside severity changes for efficiency.
1. Standardize severities. Changes to severity will affect charts, reports, and security policies, so we will first assess the impact by listing out any proposed severity changes. Then, we will figure out the right actions to take to minimize impact. This could, for example, include a phased rollout until a default change in %17.0.
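
As a rough illustration of the "list out any proposed severity changes" step, the sketch below compares two rule-to-severity mappings and reports which rules would be raised or lowered. It is not part of sast-rules or any GitLab tooling: the function name, the rule IDs, and the severity ordering used for comparison are all assumptions made for this example.

```python
# Hypothetical sketch: list proposed severity changes between two mappings.
# The ordering below is an assumption used only to decide "raised" vs "lowered".
SEVERITY_ORDER = ["Info", "Low", "Medium", "High", "Critical"]


def list_severity_changes(current, proposed):
    """Return (rule_id, old, new, direction) for each rule whose severity differs.

    Only rules present in both mappings are compared; added or removed
    rules are out of scope for this sketch.
    """
    changes = []
    for rule_id in sorted(current.keys() & proposed.keys()):
        old, new = current[rule_id], proposed[rule_id]
        if old == new:
            continue
        raised = SEVERITY_ORDER.index(new) > SEVERITY_ORDER.index(old)
        changes.append((rule_id, old, new, "raised" if raised else "lowered"))
    return changes


if __name__ == "__main__":
    # Made-up rule IDs and severities, for illustration only.
    current = {"rule-a": "High", "rule-b": "Medium", "rule-c": "Low"}
    proposed = {"rule-a": "Medium", "rule-b": "Medium", "rule-c": "High"}
    for rule_id, old, new, direction in list_severity_changes(current, proposed):
        print(f"{rule_id}: {old} -> {new} ({direction})")
```

A list like this could then be reviewed against charts, reports, and security policies that filter on severity before any change is rolled out.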
## Materials
[Meeting notes about rule refinement](https://docs.google.com/document/d/15MfwpOwSOU2lUwFp2Er2YJWji-AScQoug06SUisOXJc/edit) (team members only)