Provide and version semgrep detection rules separately from SAST analyzers
Proposal
The current process for adding and deploying new semgrep rules is cumbersome. First, an engineer must test and add rules to the sast-rules project. The rules committed to that project are then packaged. However, the semgrep analyzer does not consume these packages automatically: someone has to manually download the package, untar it, and open an MR that adds the rules to the semgrep analyzer.
We should remove the need to manually download and untar semgrep rules from sast-rules packages and instead load the rules into the analyzer automatically. Potential changes are available for review here: gitlab-org/security-products/analyzers/semgrep!147 (closed).
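As a rough illustration of what automating the download-and-untar step could look like, here is a minimal Python sketch. It is not the approach taken in the linked MR. The function names, the package URL argument, and the assumption that rules are shipped as `.yml`/`.yaml` files inside a tarball are all hypothetical.

```python
import tarfile
import tempfile
import urllib.request
from pathlib import Path


def extract_rules(archive_path: str, dest_dir: str) -> list[str]:
    """Extract only YAML rule files from a tarball, rejecting unsafe paths."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            # Skip directories, links, and anything that is not a rule file.
            if not member.isfile() or not member.name.endswith((".yml", ".yaml")):
                continue
            target = (dest / member.name).resolve()
            # Refuse entries that would land outside the destination directory.
            if dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(tar.extractfile(member).read())
            extracted.append(str(target.relative_to(dest)))
    return sorted(extracted)


def fetch_rules(package_url: str, dest_dir: str) -> list[str]:
    """Download a rules package (the URL is a placeholder) and unpack it."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "rules.tar.gz"
        urllib.request.urlretrieve(package_url, str(archive))
        return extract_rules(str(archive), dest_dir)
```

A build step could call `fetch_rules` with the URL of a released sast-rules package and the analyzer's rule directory. Pinning the URL to a specific package version would keep builds reproducible.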
Improvements
Separating the central rule repository from the analyzer improves the status quo in the following ways:
- Better Quality Assurance
  - Rule/Configuration Management (tracking changes and their effects)
  - Enforcement of data-policies that we need internally (e.g., CWE information, etc.)
- Deployment
  - SCM (+Automated Semantic Versioning) (rollback changes, reproduce issues)
  - Decouple the release process of the semgrep analyzer from the rules
  - Easy support for air-gapped environments (the same mechanism we use for dependency scanning)
  - Different licensing for the semgrep analyzer and the GitLab ruleset, plus support for copy-left licenses for rules/test-cases stored in the central repository (optional)
- Simplicity
  - SAST rules as code
  - Single source of truth
  - Use the same workflow we already have in place for Dependency Scanning
  - Implicit documentation: every rule has to have a test-case, both for our automated gap analysis and for documentation purposes.
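The data-policy and test-case requirements listed above could be enforced by a small check in the central repository's CI. The sketch below is an assumption about how such a check might look, not a description of the existing tooling. It operates on already-parsed rule dictionaries and assumes that CWE information lives under `metadata.cwe` and that test cases are matched to rules by rule id.

```python
def check_rule_policy(rule: dict, test_case_ids: set[str]) -> list[str]:
    """Return a list of policy violations for one parsed Semgrep rule."""
    rule_id = rule.get("id")
    if not rule_id:
        return ["rule has no id"]
    problems = []
    metadata = rule.get("metadata") or {}
    # Assumed policy: every rule carries CWE information in its metadata.
    if not metadata.get("cwe"):
        problems.append(f"{rule_id}: missing metadata.cwe")
    # Assumed policy: every rule has at least one test case, keyed by rule id.
    if rule_id not in test_case_ids:
        problems.append(f"{rule_id}: no test case found")
    return problems


# Hypothetical rules and test-case index, for illustration only.
rules = [
    {"id": "sql-injection", "metadata": {"cwe": "CWE-89"}},
    {"id": "weak-hash", "metadata": {}},
]
tests = {"sql-injection"}

for rule in rules:
    for problem in check_rule_policy(rule, tests):
        print(problem)
# prints:
# weak-hash: missing metadata.cwe
# weak-hash: no test case found
```

Failing the pipeline whenever the list of violations is non-empty would make the policy self-enforcing for every contributed rule.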
Involved components
Tasks
- Make sast-rules available for rule distribution.
- Update both Dockerfile and Dockerfile.fips to pull in the sast-rules distribution during the build process. Related issue: #390908 (closed)
- Remove the /rules directory from the semgrep analyzer | #390908 (closed)
- Ensure that the primary identifiers in the generated gl-sast report remain the same
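The last task could be verified by comparing a report generated before the change with one generated after it. The following Python sketch assumes the report is a JSON document with a top-level `vulnerabilities` array and that the first entry of each vulnerability's `identifiers` list is the primary identifier. Both assumptions should be checked against the report schema in use. The function names are hypothetical.

```python
def primary_identifiers(report: dict) -> set[tuple[str, str]]:
    """Collect (type, value) of the first identifier of every vulnerability."""
    found = set()
    for vuln in report.get("vulnerabilities", []):
        identifiers = vuln.get("identifiers") or []
        if identifiers:
            first = identifiers[0]
            found.add((first.get("type", ""), first.get("value", "")))
    return found


def identifier_drift(before: dict, after: dict) -> dict:
    """Report primary identifiers that disappeared or newly appeared."""
    old, new = primary_identifiers(before), primary_identifiers(after)
    return {"removed": sorted(old - new), "added": sorted(new - old)}
```

Loading both reports with `json.load` and asserting that `identifier_drift` returns empty `removed` and `added` lists would flag any change in primary identifiers introduced by switching to the packaged rules.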