Secret detection gem scan logic

Serena Fang requested to merge implement-secret-detection-gem-scan into master

What does this MR do and why?

Epic: Build a Ruby gem to perform secrets regex match... (&11612 - closed)

This MR adds the keyword and regex scanning logic that checks for leaked secrets.

To perform regex matching on git blobs that may include secrets, we are creating a gem that will be included as a dependency in the main GitLab codebase (gitlab-org/gitlab). This dependency will accept one or more git blobs, match them against GitLab-specific tokens (based on #427011 (comment 1646420504)), and return scan results.
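
To make the flow above concrete, here is a minimal sketch of keyword-then-regex scanning over a list of blobs. It is not the gem's implementation: the `Blob` struct, the `RULES` constant, the `scan_blobs` method, and the single rule shown are all hypothetical names chosen for illustration, and it uses Ruby's built-in `Regexp` rather than the re2 bindings the gem depends on.

```ruby
# Hypothetical blob shape: an identifier plus the blob's text content.
Blob = Struct.new(:id, :data)

# Hypothetical rule set. The gem itself reads its patterns from gitleaks.toml;
# this single rule is an illustrative stand-in.
RULES = [
  {
    id: "example_personal_access_token",
    keywords: ["glpat"],
    regex: /glpat-[0-9a-zA-Z_\-]{20}/
  }
].freeze

# Scans each blob and returns one hash per matching line.
def scan_blobs(blobs, rules: RULES)
  blobs.flat_map do |blob|
    # Keyword pre-filter: skip the regex pass for rules whose keywords
    # never appear in the blob.
    candidates = rules.select do |rule|
      rule[:keywords].any? { |keyword| blob.data.include?(keyword) }
    end
    next [] if candidates.empty?

    # Regex pass: check each line against the remaining rules.
    blob.data.each_line.with_index(1).flat_map do |line, line_number|
      candidates
        .select { |rule| rule[:regex].match?(line) }
        .map { |rule| { blob_id: blob.id, line_number: line_number, rule_id: rule[:id] } }
    end
  end
end
```

The keyword check is a cheap substring test, so blobs that cannot match any rule are skipped before any regex is evaluated.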

Resolves Implement scanning logic in the gem (#427021 - closed)

| Step | Merge Request | Description |
| --- | --- | --- |
| 1 | !135510 (merged) | Create an empty gem |
| 2 | This one | Implement the scanning logic |
| 3 | !137812 (merged) | Connect the push check to the gem |

The gitlab-secret_detection gem will be called by the secrets push check, which is implemented by another series of MRs.

| Step | Merge Request | Description |
| --- | --- | --- |
| 1 | !135032 (merged) | Adds the secrets push check, and puts it behind a feature flag. |
| 2 | !135036 (merged) | Updates the secrets push check to check for license (only ultimate is allowed). |
| 3 | !135164 (merged) | Adds a new application setting for pre-receive SD, and updates the secrets push check accordingly. |
| 4 | !135273 (merged) | Adds the UI for toggling the application setting of pre-receive SD |

Brief Explanation of the changes

I admit that 11 file changes in a single MR are difficult to review. Most of the changes, however, are either consequential (such as gem updates) or follow best practices (such as separating data objects into dedicated files). Here is a brief explanation of the changes introduced in this MR.

├── Gemfile.lock --> upgrade the re2 and tomlrb dependencies to their latest versions, which contain performance improvements
└── gems/gitlab-secret_detection
    ├── Gemfile.lock --> upgrade dependencies as recommended by the AppSec team
    ├── gitlab-secret_detection.gemspec --> differentiate dev and runtime dependencies as needed
    ├── lib
    │   ├── gitlab
    │   │   ├── secret_detection
    │   │   │   ├── finding.rb --> Plain Old Ruby Object (PORO) representing an instance of a secret finding in a blob
    │   │   │   ├── response.rb --> PORO representing the response returned by the Secret Detection scan operation
    │   │   │   ├── scan.rb --> implementation for running Secret Detection (SD) on the given blobs, along with config options
    │   │   │   └── status.rb --> constants file containing status codes for each success/error scenario that could occur during the scan
    │   │   └── secret_detection.rb --> requires the newly introduced PORO and constants files
    │   └── gitleaks.toml --> ruleset file containing the regex patterns used by the Secret Detection scan operation
    └── spec
        └── lib
            └── gitlab
                ├── secret_detection
                │   └── scan_spec.rb --> added more tests for the new implementation
                └── secret_detection_spec.rb --> deleted; this file was copy-pasted from another gem used as a reference
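
As a rough illustration of how the finding, response, and status files listed above could fit together, the sketch below defines three small plain Ruby objects. Everything in it is an assumption made for illustration: the `SecretDetectionSketch` namespace, the constant names and values, the attribute names, and the `from_findings` helper are hypothetical and may not match the gem's files.

```ruby
module SecretDetectionSketch
  # Hypothetical status codes; the gem's real values live in status.rb.
  module Status
    NOT_FOUND = 0   # no secrets detected
    FOUND = 1       # at least one secret detected
    SCAN_ERROR = 2  # the scan could not be completed
  end

  # One detected secret in one blob (attribute names are assumptions).
  class Finding
    attr_reader :blob_id, :status, :line_number, :type

    def initialize(blob_id:, status:, line_number: nil, type: nil)
      @blob_id = blob_id
      @status = status
      @line_number = line_number
      @type = type
    end

    def to_h
      { blob_id: blob_id, status: status, line_number: line_number, type: type }
    end

    # Value equality keeps findings easy to compare in specs.
    def ==(other)
      other.is_a?(Finding) && to_h == other.to_h
    end
  end

  # Overall result of a scan: one status plus the individual findings.
  class Response
    attr_reader :status, :results

    def initialize(status:, results: [])
      @status = status
      @results = results
    end

    # Hypothetical helper that derives the overall status from the findings.
    def self.from_findings(findings)
      status = findings.empty? ? Status::NOT_FOUND : Status::FOUND
      new(status: status, results: findings)
    end
  end
end
```

Keeping these as separate data objects means a caller such as a push check only needs to inspect `status` and `results`, not the scanning internals.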

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Serena Fang
