Document and add tests for semgrepignore file behaviour for GitLab Semgrep and GitLab Advanced SAST

Proposal

While working on "Remove confusing semgrepignore comment" (gitlab-org/security-products/analyzers/semgrep!526 - merged • Adam Cohen • 17.6), I noticed that we haven't documented the behaviour of the semgrepignore file for GitLab Semgrep or GitLab Advanced SAST.

The purpose of this issue is to:

  1. Add documentation to the SAST docs to explain how the semgrepignore file works in GitLab Semgrep and GitLab Advanced SAST.
  2. Add integration tests to GitLab Semgrep and GitLab Advanced SAST which demonstrate the behaviour of the ignore files.
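As a starting point for item 2, the fixture below is a minimal sketch of what such an integration test could check, based on the behaviour described under "Further details". The directory names, the `AnalyzerExpectation` type, and the helper functions are hypothetical and are not part of either analyzer's test suite. The expectation that GitLab Advanced SAST honours a user-provided .semgrepignore is an assumption that the test itself would need to confirm.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AnalyzerExpectation:
    """Expected ignore-file behaviour for one analyzer (hypothetical type)."""
    name: str
    honours_user_semgrepignore: bool
    honours_gitignore: bool


EXPECTATIONS = [
    # Stated in this issue: user .semgrepignore is included, .gitignore is disabled.
    AnalyzerExpectation("GitLab Semgrep", True, False),
    # .gitignore is stated in this issue; the .semgrepignore value is an assumption.
    AnalyzerExpectation("GitLab Advanced SAST", True, True),
]

# Hypothetical fixture layout: one directory per ignore mechanism.
FIXTURE_FILES = {
    "src/app.py": "print('always scanned')\n",
    "skipped_by_semgrepignore/app.py": "print('listed in .semgrepignore')\n",
    "skipped_by_gitignore/app.py": "print('listed in .gitignore')\n",
    ".semgrepignore": "skipped_by_semgrepignore/\n",
    ".gitignore": "skipped_by_gitignore/\n",
}


def write_fixture(root: Path) -> None:
    """Create the fixture files under root."""
    for relative_path, content in FIXTURE_FILES.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)


def expected_scanned(expectation: AnalyzerExpectation) -> set[str]:
    """Return the source files the analyzer is expected to scan."""
    scanned = {"src/app.py"}
    if not expectation.honours_user_semgrepignore:
        scanned.add("skipped_by_semgrepignore/app.py")
    if not expectation.honours_gitignore:
        scanned.add("skipped_by_gitignore/app.py")
    return scanned
```

A real integration test would run each analyzer against the fixture and compare the files it reports as scanned with `expected_scanned(...)`.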

Further details

According to Understand Semgrep defaults, if a user-generated .semgrepignore file does not exist in the repository's root directory or the project's working directory, then the following ignore files will be used:

For the GitLab Semgrep and GitLab Advanced SAST analyzers, we have the following behaviour:

  • GitLab Semgrep

    • We've added a user-generated, GitLab-specific ignore file with a custom name: semgrepignore, and we've configured the SEMGREP_R2C_INTERNAL_EXPLICIT_SEMGREPIGNORE variable to point to this custom file.

    • Our custom semgrepignore defines a number of default patterns and uses the :include .semgrepignore directive, which also allows users to provide a custom .semgrepignore file.

    • We've also disabled the use of the repository's .gitignore file by passing the --no-git-ignore flag to the semgrep binary.

  • GitLab Advanced SAST

    • Does not use a GitLab-specific ignore file, so it uses Semgrep's default .semgrepignore file.
    • Does not disable the repository's .gitignore file.
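To illustrate how the :include .semgrepignore directive combines the GitLab defaults with a user's patterns, here is a simplified model in Python. It is not Semgrep's actual parser: the function name, the example patterns, and the choice to resolve included paths relative to the project directory are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path
from typing import Optional

INCLUDE_DIRECTIVE = ":include "


def expand_ignore_file(
    ignore_file: Path, project_dir: Path, _seen: Optional[set] = None
) -> list[str]:
    """Return the patterns from ignore_file with ':include' lines expanded.

    Assumptions: included paths are resolved relative to project_dir, and a
    missing included file contributes no patterns.
    """
    seen = set() if _seen is None else _seen
    resolved = ignore_file.resolve()
    if resolved in seen:  # guard against files that include each other
        return []
    seen.add(resolved)

    patterns = []
    for raw_line in ignore_file.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith(INCLUDE_DIRECTIVE):
            included = project_dir / line[len(INCLUDE_DIRECTIVE):].strip()
            if included.is_file():
                patterns.extend(expand_ignore_file(included, project_dir, seen))
            continue
        patterns.append(line)
    return patterns


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        project = root / "project"
        project.mkdir()

        # Stand-in for the GitLab-specific file; the default patterns are made up.
        analyzer_file = root / "semgrepignore"
        analyzer_file.write_text("# defaults\nnode_modules/\n:include .semgrepignore\n")

        # Optional user-provided file in the project.
        (project / ".semgrepignore").write_text("vendor/\n")

        print(expand_ignore_file(analyzer_file, project))
```

In this model, a project without a .semgrepignore file gets only the default patterns, while a project with one gets the defaults followed by its own patterns.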

Implementation Plan

TBD

/cc @connorgilbert @thiagocsf