feat: add SAST tool semgrep

Closes #132

This implementation is inspired by our Node.js, Go, and .NET patterns, with a few key adjustments. Since late 2025, semgrep ci requires a login and otherwise fails with this error:

run `semgrep login` before using `semgrep ci` or use `semgrep scan` and set `--config`

To work around this, I have implemented a local scanning strategy that removes the login requirement.

Scan logic optimisation (inspired by the .NET pattern):

  • Reference Branches: Full scans are executed for $RELEASE_REF, $INTEG_REF, and $PROD_REF.
  • Feature Branches: Differential scans are performed against $CI_DEFAULT_BRANCH, falling back to a full scan if that reference does not exist.
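
The branch logic above can be sketched roughly as follows. This is only an illustration, not the job's actual script: the helper names `resolve_baseline` and `build_scan_command` and the `rules` path are hypothetical, and the sketch assumes the differential scan relies on Semgrep's `--baseline-commit` option and that the default branch is available as `origin/<branch>` in the CI clone.

```python
import subprocess
from typing import Iterable, List, Optional


def resolve_baseline(default_branch: str) -> Optional[str]:
    """Return the commit SHA of origin/<default_branch>, or None if it is missing.

    Assumption: the CI clone has fetched the default branch as a remote ref.
    """
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--verify", "--quiet",
             f"origin/{default_branch}^{{commit}}"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # git is not installed in the image: no baseline can be resolved.
        return None
    sha = result.stdout.strip()
    return sha if result.returncode == 0 and sha else None


def build_scan_command(
    branch: str,
    reference_branches: Iterable[str],
    baseline: Optional[str],
    rules_path: str = "rules",  # hypothetical location of the downloaded rules
) -> List[str]:
    """Build the semgrep command line for the current branch."""
    command = ["semgrep", "scan", "--config", rules_path]
    # Reference branches always get a full scan. Feature branches get a
    # differential scan only when a baseline commit could be resolved.
    if branch not in set(reference_branches) and baseline:
        command += ["--baseline-commit", baseline]
    return command
```

In a job, `branch` would typically come from `$CI_COMMIT_REF_NAME`, `reference_branches` from `$RELEASE_REF`, `$INTEG_REF`, and `$PROD_REF`, and `baseline` from `resolve_baseline(os.environ["CI_DEFAULT_BRANCH"])`. Passing `None` as the baseline yields the full-scan fallback.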

Technical Configuration:

  • Reporting: Native and GitLab-SAST formats are active; SARIF is currently disabled, although Semgrep supports it natively.
  • Environment: The job runs on a Semgrep Docker image while inheriting from .python-base. Since Semgrep is natively Python-based, this ensures compatibility, even if the hybrid image setup looks unconventional.
  • Optimization: A Python-based rule downloader with ETag caching avoids re-downloading unchanged rules. Python is a natural fit here, since Semgrep already depends on it.
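
For reviewers unfamiliar with ETag caching, here is a minimal sketch of the idea using only the standard library. It is not the downloader from this MR: the function name `download_rules`, the `.etag` sidecar file, and the injectable `opener` parameter are all assumptions made for illustration.

```python
import urllib.error
import urllib.request
from pathlib import Path


def download_rules(url, dest, opener=urllib.request.urlopen):
    """Download `url` to `dest` unless the cached copy is still current.

    Returns True if the file was (re)downloaded, False if the server
    answered 304 Not Modified and the cached copy was kept.
    """
    dest = Path(dest)
    etag_file = dest.with_name(dest.name + ".etag")  # hypothetical sidecar file

    request = urllib.request.Request(url)
    # Send the stored ETag only if the cached rules file is still present.
    if dest.exists() and etag_file.exists():
        request.add_header("If-None-Match", etag_file.read_text().strip())

    try:
        with opener(request) as response:
            body = response.read()
            etag = response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return False  # cached copy is up to date
        raise

    dest.write_bytes(body)
    if etag:
        etag_file.write_text(etag)
    else:
        # Without an ETag the next run cannot be conditional.
        etag_file.unlink(missing_ok=True)
    return True
```

The `opener` argument exists only so the function can be exercised without network access; in a job it would be left at its default.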

Next Steps:

  • I am testing a standalone Semgrep installation on a pure Python image, though I still need to integrate the maybe_install_packages logic to install Git.

Feedback on this WIP is appreciated 😃

Edited by Bertrand Goareguer
