Exclude dirs from SAST, Dependency Scanning analysis
### Problem to solve

Currently there is no way to exclude a directory of a repository from analysis. As a result, reports include vulnerabilities that are not relevant to the development team.

### Target audience

- Delaney, Development Team Lead, https://design.gitlab.com/research/personas#persona-delaney
- Sasha, Software Developer, https://design.gitlab.com/research/personas#persona-sasha

### Proposal

Introduce new variables `SAST_EXCLUDED_PATHS` and `DS_EXCLUDED_PATHS` to set a list of excluded paths in SAST and Dependency Scanning, respectively.

When generating a report, SAST, DS and their analyzers automatically remove all vulnerabilities whose location matches one of the excluded paths. The filter uses the `.location.path` key of the vulnerability.

`SAST_EXCLUDED_PATHS` and `DS_EXCLUDED_PATHS` act as a **post-filter**: they don't prevent the excluded paths from being scanned, but instead remove findings located in those paths from the generated output. It would be more efficient to skip the excluded paths when scanning the repo, but this is far more complex given the diversity of the tools SAST relies on. The post-filter is the easiest way to achieve consistency across all the analyzers.

`SAST_EXCLUDED_PATHS` and `DS_EXCLUDED_PATHS` are comma-separated lists of patterns. Patterns can be globs, or file or folder paths. Parent directories will also match patterns.

It's important that the filter is implemented in both SAST/DS and their analyzers. This way it also benefits customers who use the analyzer Docker images directly (e.g. without relying on the main `sast` or `dependency-scanning` image).

**Out of scope**: If possible, the analyzer/wrapper may leverage `SAST_EXCLUDED_PATHS` and pass it to the command line program it relies on, so that excluded paths are skipped during scanning. In that case the environment variable would be used both as a pre-filter and a post-filter. But consistency matters, and analyzers should not reuse this environment variable unless they implement the exact same pattern matching.

### TODO

- [x] specify the pattern of excluded paths, AKA glob syntax
  - See https://gitlab.com/gitlab-org/gitlab-ee/issues/10030#note_163363781
- [x] implement in common library
  - parsing of the comma-separated list of excluded paths
  - matching function to tell whether a path is excluded
- filtering of excluded paths in analyzer
- filtering of excluded paths in orchestrator (SAST itself)
- ~~discuss default value, if any~~
- [x] update job definition
- [x] update documentation
  - update SAST doc
  - update DS doc

### Links / references

ZD https://gitlab.zendesk.com/agent/tickets/114449

A sample Python project can be found at https://gitlab.com/televi/sast-issue-114449

What happens:

* SAST identifies issues in the `tests` directory as well as in the `hello_world` directory

What should happen:

* SAST should only report issues in the `hello_world` directory since the `tests` directory is part of the default set of ignored directories
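
### Illustrative sketch of the post-filter

To make the proposal concrete, here is a minimal Python sketch of the post-filter: parse the comma-separated variable, decide whether a path is excluded, and drop matching vulnerabilities from a report. It is not the actual implementation. The function names (`parse_excluded_paths`, `is_excluded`, `filter_vulnerabilities`) are hypothetical, Python's `fnmatch` stands in for the glob syntax specified in the linked note, and "parent directories will also match" is read as: a path is excluded when the path itself or any of its parent directories matches a pattern.

```python
import fnmatch
from pathlib import PurePosixPath


def parse_excluded_paths(value):
    """Split a comma-separated variable value into a list of patterns."""
    return [p.strip() for p in value.split(",") if p.strip()]


def is_excluded(path, patterns):
    """Return True if the path, or one of its parent directories, matches a pattern.

    Assumption: fnmatch-style globs approximate the real pattern syntax.
    """
    p = PurePosixPath(path)
    # Candidates: the path itself, then each parent ("a/b/c.py" -> "a/b", "a").
    candidates = [str(p)] + [str(d) for d in p.parents if str(d) != "."]
    for pattern in patterns:
        pattern = pattern.rstrip("/")  # treat "tests/" like "tests"
        if any(fnmatch.fnmatchcase(c, pattern) for c in candidates):
            return True
    return False


def filter_vulnerabilities(vulnerabilities, patterns):
    """Post-filter: keep only vulnerabilities whose location is not excluded."""
    return [
        v for v in vulnerabilities
        if not is_excluded(v["location"]["path"], patterns)
    ]


# Hypothetical report entries, shaped after the `.location.path` key in the proposal.
report = [
    {"name": "finding A", "location": {"path": "tests/test_hello.py"}},
    {"name": "finding B", "location": {"path": "hello_world/app.py"}},
]
patterns = parse_excluded_paths("tests, *.md")
kept = filter_vulnerabilities(report, patterns)
```

With the sample project layout from the references, setting the variable to `tests` would leave only the finding under `hello_world` in `kept`. Note that in `fnmatch` a `*` also matches `/`, which may differ from the glob syntax finally chosen.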