Improve quality of Advanced SAST results by analyzing benchmarking projects
# Background

This epic contains a set of proactive research projects to improve the real-world efficacy of the Advanced SAST ruleset. We will use intentionally-vulnerable benchmark and example applications to identify patterns that our current ruleset does not accurately detect. We will use these example cases to _inform_ our analysis, but _not_ dictate it; we are still guided by our own analysis of the best customer experience. For example, we will not add rules that we judge to provide little value just because an example app asks us to detect them.

# Technical details

## Process

1. Gather a list of benchmarking projects per language
2. Assign a language to each team member
3. Create a GitLab issue per project
4. Run [analysis](https://gitlab.com/groups/gitlab-org/security-products/-/epics/7#analysis) on that project

The work on this effort has been split into two categories:

- Per programming language: the efforts to create the baseline for a given project (per language)
- Globally: efforts around creating benchmarking tooling and dashboards

### Globally

* Define a global JSON schema for the programmatic representation of the expected results set (ground truth)
* Define a schema for, and populate, the engine limitations JSON file, which will be referenced in the ground truth
* Create a CI job/template that runs GLAS on all the completed projects against a version of the rules. Based on the programmatic representation of the expected results set, it would report whether we have deviations in detections.
* Store the benchmarking results (scanning evidence and the statistical report) in a way that gives us an over-time view of our accuracy, per project and overall

### Per programming language

We are targeting _Python_, _Java_ and _Go_ as the first languages.
Focus on 2-3 projects per language:

* Complete the expected results set for each project in a given language
* Close the gap to reach 100% effective accuracy (excluding engine limitations)
* Translate the expected results set into its programmatic form
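To make the deviation check above concrete, here is a minimal sketch of what the ground-truth comparison could look like. The ground-truth schema is still to be defined, so the `expected` record shape (`path`/`line`/`cwe`/`engine_limitation` keys and the `EL-0001` limitation ID) is a hypothetical placeholder, not the final format; the report-parsing half assumes the standard GitLab SAST report layout (`vulnerabilities[].location` and `identifiers[]`).

```python
import json

# Hypothetical ground-truth entries (real schema TBD): each expected finding
# records where it should be reported and which rule (CWE) should fire.
expected = [
    {"path": "app/views.py", "line": 42, "cwe": "CWE-89"},
    {"path": "app/views.py", "line": 77, "cwe": "CWE-79",
     "engine_limitation": "EL-0001"},  # known limitation, excluded from accuracy
]

def load_glas_findings(report_path):
    """Extract (path, line, CWE) triples from a GLAS JSON report.

    Assumes the GitLab SAST report layout: vulnerabilities[].location.file,
    location.start_line, and identifiers[] entries with type "cwe".
    """
    with open(report_path) as f:
        report = json.load(f)
    found = set()
    for vuln in report.get("vulnerabilities", []):
        loc = vuln.get("location", {})
        for ident in vuln.get("identifiers", []):
            if ident.get("type") == "cwe":
                found.add((loc.get("file"), loc.get("start_line"),
                           f"CWE-{ident.get('value')}"))
    return found

def deviations(expected, found):
    """Return expected findings GLAS missed, excluding known engine limitations."""
    applicable = [e for e in expected if "engine_limitation" not in e]
    return [e for e in applicable
            if (e["path"], e["line"], e["cwe"]) not in found]
```

A CI job along these lines could fail whenever `deviations(...)` is non-empty, flagging regressions in detection against the completed projects.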