doc: Add Secret Detection Service blueprint

Relates to #376716

Commit ce0d7a53 authored 2 years ago by Lucas Charles
--- a/doc/architecture/blueprints/secret_detection/index.md
+++ b/doc/architecture/blueprints/secret_detection/index.md
+---
+status: proposed
+creation-date: 2022-11-25
+authors: [ "@theoretick" ]
+coach: ""
+approvers: [ "@connorgilbert", "@amarpatel" ]
+owning-stage: "~devops::secure"
+participating-stages: []
+---
+
+<!--
+**Note:** Please remove comment blocks for sections you've filled in.
+When your blueprint is complete, all of these comment blocks should be removed.
+
+To get started with a blueprint you can use this template to inform you about
+what you may want to document in it at the beginning. This content will change
+/ evolve as you move forward with the proposal.  You are not constrained by the
+content in this template. If you have a good idea about what should be in your
+blueprint, you can ignore the template, but if you don't know yet what should
+be in it, this template might be handy.
+
+- **Fill out this file as best you can.** At minimum, you should fill in the
+  "Summary", and "Motivation" sections.  These can be brief and may be a copy
+  of issue or epic descriptions if the initiative is already on Product's
+  roadmap.
+- **Create a MR for this blueprint.** Assign it to an Architecture Evolution
+  Coach (i.e. a Principal+ engineer).
+- **Merge early and iterate.** Avoid getting hung up on specific details and
+  instead aim to get the goals of the blueprint clarified and merged quickly.
+  The best way to do this is to just start with the high-level sections and fill
+  out details incrementally in subsequent MRs.
+
+Just because a blueprint is merged does not mean it is complete or approved.
+Any blueprint is a working document and subject to change at any time.
+
+When editing blueprints, aim for tightly-scoped, single-topic MRs to keep
+discussions focused. If you disagree with what is already in a document, open a
+new MR with suggested changes.
+
+If there are new details that belong in the blueprint, edit the blueprint. Once
+a feature has become "implemented", major changes should get new blueprints.
+
+The canonical place for the latest set of instructions (and the likely source
+of this file) is [here](/doc/architecture/blueprints/_template.md).
+-->
+
+# Secret Detection as a platform-wide experience
+
+## Summary
+
+Today's secret detection feature is built around containerized scans of repositories
+within a pipeline context. Its coverage is narrow compared to the many places where leaked
+or compromised tokens may appear, and it should be expanded to a much wider scope.
+
+Secret detection as a platform-wide experience should encompass detection across
+critical platform features with a high risk of secret leakage, including repository contents,
+job logs, and project management features such as issues, epics, and merge requests.
+
+## Motivation
+
+### Goals
+
+- Support asynchronous secret detection for:
+  - push events
+  - issuable creation
+  - issuable updates
+  - issuable comments
+
+### Non-Goals
+
+The current proposal is limited to asynchronous detection and alerting only.
+
+**Blocking** secrets on push events adds risk to a critical path and
+would require extensive performance profiling before implementation. See
+[a recent example](https://gitlab.com/gitlab-org/gitlab/-/issues/246819#note_1164411983)
+of a customer incident where this was attempted.
+
+Secret revocation and rotation are also beyond the scope of this new capability.
+
+Object types beyond the scope of this MVC include:
+
+- Media types (JPEGs, PDFs, ...)
+- Snippets
+- Wikis
+
+## Proposal
+
+To achieve scalable secret detection for a variety of domain objects, a dedicated
+scanning service must be created and deployed alongside the GitLab distribution.
+This service is referred to as the `SecretScanningService`.
+
+This service must be:
+
+- highly performant
+- horizontally scalable
+- generic in domain object scanning capability
+
+Platform-wide secret detection should be enabled by default on GitLab SaaS as well
+as on self-managed instances.
+
+## Design and implementation details
+
+<!--
+This section should contain enough information that the specifics of your
+change are understandable. This may include API specs (though not always
+required) or even code snippets. If there's any ambiguity about HOW your
+proposal will be implemented, this is the place to discuss them.
+
+If you are not sure how many implementation details you should include in the
+blueprint, the rule of thumb here is to provide enough context for people to
+understand the proposal. As you move forward with the implementation, you may
+need to add more implementation details to the blueprint, as those may become
+an important context for important technical decisions made along the way. A
+blueprint is also a register of such technical decisions. If a technical
+decision requires additional context before it can be made, you probably should
+document this context in a blueprint. If it is a small technical decision that
+can be made in a merge request by an author and a maintainer, you probably do
+not need to document it here. The impact a technical decision will have is
+another helpful information - if a technical decision is very impactful,
+documenting it, along with associated implementation details, is advisable.
+
+Feel free to include workflow diagrams or other related images where helpful.
+Diagrams authored in GitLab flavored markdown are preferred. In cases where
+that is not feasible, images should be placed under `images/` in the same
+directory as the `index.md` for the proposal.
+-->
+
+The critical paths as outlined under [goals above](#goals) cover two major object
+types: Git blobs (corresponding to push events) and arbitrary text blobs.
+
+The detection flow for push events relies on subscribing to the PostReceive hook
+and enqueueing Sidekiq requests to the `SecretScanningService`. The `SecretScanningService`
+will fetch enqueued refs, query Gitaly for the ref blob contents, scan
+the commit contents, and notify the Rails application when a secret is detected.
+
+The detection flow for arbitrary text blobs, such as issue comments, relies on
+subscribing to `Notes::PostProcessService` (or an equivalent service) and enqueueing
+Sidekiq requests to the `SecretScanningService`. Each request identifies the text blob by
+the object type and primary key of its domain object. The `SecretScanningService` will
+fetch the relevant text blob, scan its contents, and notify the Rails application when
+a secret is detected.
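The text blob flow above can be sketched as follows. This is a minimal illustration only, not the proposed implementation: `ScanRequest`, `enqueue_scan`, `process_queue`, the in-memory `TEXT_BLOBS` store, and the token pattern are all hypothetical stand-ins for Sidekiq, the Rails database, and the real rule set.

```python
import re
from dataclasses import dataclass
from queue import Queue

# Hypothetical stand-in for the Rails database: (object_type, id) -> text.
TEXT_BLOBS = {
    ("Note", 1): "Deploy with token glpat-" + "a" * 20,
    ("Issue", 7): "Nothing sensitive here.",
}

# Illustrative pattern only; the real rule set is not defined in this sketch.
TOKEN_PATTERN = re.compile(r"glpat-[0-9A-Za-z_-]{20}")


@dataclass(frozen=True)
class ScanRequest:
    """Identifies a text blob by object type and primary key, not by content."""
    object_type: str
    object_id: int


def enqueue_scan(queue, object_type, object_id):
    """Called where the domain object is created or updated."""
    queue.put(ScanRequest(object_type, object_id))


def process_queue(queue, notify):
    """Fetch each referenced blob, scan it, and notify on detection."""
    while not queue.empty():
        request = queue.get()
        text = TEXT_BLOBS.get((request.object_type, request.object_id), "")
        if TOKEN_PATTERN.search(text):
            notify(request)
```

Passing only the object type and primary key keeps the queued payload small and means the secret itself never sits in the queue; the scanner reads the current text when it processes the request.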
+
+The detection flow for job logs requires processing the log while it is archived to object
+storage. See the discussion [in this epic](https://gitlab.com/groups/gitlab-org/-/epics/8847#note_1116647883)
+on scanning during streaming and the added complexity of buffering and lookbacks
+for arbitrary trace chunks.
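To illustrate why streaming adds complexity, the sketch below scans a log delivered in arbitrary chunks and keeps a small lookback buffer so that a token split across a chunk boundary is still found. It is an assumption-laden example: `scan_stream`, the pattern, and the fixed maximum token length are hypothetical and only work for fixed-length tokens.

```python
import re

# Illustrative fixed-length pattern; an assumption, not the real rule set.
TOKEN_PATTERN = re.compile(r"glpat-[0-9A-Za-z_-]{20}")
MAX_TOKEN_LENGTH = 26  # len("glpat-") + 20


def scan_stream(chunks, pattern=TOKEN_PATTERN, lookback=MAX_TOKEN_LENGTH - 1):
    """Yield (absolute_offset, token) for each match in a chunked stream.

    The last `lookback` characters are carried into the next chunk. Because
    the lookback is one character shorter than a full token, a token that was
    already reported can never fit entirely inside the carried-over tail, so
    nothing is reported twice.
    """
    tail = ""
    consumed = 0  # characters seen in all previous chunks
    for chunk in chunks:
        buffer = tail + chunk
        base = consumed - len(tail)  # absolute offset of buffer[0]
        for match in pattern.finditer(buffer):
            yield base + match.start(), match.group()
        consumed += len(chunk)
        tail = buffer[-lookback:] if lookback > 0 else ""
```

Scanning the complete log once at archive time avoids this bookkeeping entirely, which is the trade-off the paragraph above refers to.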
+
+Whenever a secret is detected, the Rails application will create a vulnerability
+using the `Vulnerabilities::ManuallyCreateService` to surface the finding within the
+existing Vulnerability Management UI.
+
+See [technical discovery](https://gitlab.com/gitlab-org/gitlab/-/issues/376716)
+for further background exploration.
+
+### Token types
+
+The existing Secret Detection configuration covers ~100 rules across a variety
+of platforms. To reduce the total cost of execution and the likelihood of false positives,
+the dedicated service will target only well-defined tokens. A well-defined token is
+one with a precise format, most often a fixed prefix or suffix and a fixed
+length.
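As a rough sketch of what "well-defined" means in practice, the rules below match on a fixed prefix plus a fixed length. The specific formats (`glpat-` followed by 20 characters, `AKIA` followed by 16 characters) are assumptions made for illustration, and `WELL_DEFINED_RULES` and `find_secrets` are hypothetical names, not part of the existing configuration.

```python
import re

# Assumed token formats, for illustration only. Each rule pairs a fixed
# prefix with a fixed length; the lookarounds reject longer look-alikes.
WELL_DEFINED_RULES = {
    "gitlab_personal_access_token": re.compile(
        r"glpat-[0-9A-Za-z_-]{20}(?![0-9A-Za-z_-])"
    ),
    "aws_access_key_id": re.compile(
        r"(?<![0-9A-Z])AKIA[0-9A-Z]{16}(?![0-9A-Z])"
    ),
}


def find_secrets(text):
    """Return (rule_name, matched_token) pairs found in the text."""
    findings = []
    for rule_name, pattern in WELL_DEFINED_RULES.items():
        for match in pattern.finditer(text):
            findings.append((rule_name, match.group()))
    return findings
```

Anchoring on both prefix and length is what keeps false positives low: a string with the right prefix but the wrong length is ignored rather than flagged.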
+
+Token types to identify in order of importance:
+
+1. Well-defined GitLab tokens (for example, Personal Access Tokens and Pipeline Trigger Tokens)
+1. Verified Partner tokens (for example, AWS)
+1. Remaining tokens currently included in the Secret Detection CI configuration
+
+### Detection engine
+
+Our current secret detection offering uses [Gitleaks](https://github.com/zricethezav/gitleaks/)
+for all secret scanning within pipeline contexts. By using its `--no-git` option
+we can scan arbitrary text blobs outside of a repository context and continue to
+use it for non-pipeline scanning.
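A minimal sketch of scanning a text blob outside a repository might look like the following. It assumes a `gitleaks` binary on `PATH` that supports the `detect` subcommand with `--no-git`, `--source`, `--report-format`, and `--report-path` (flag names vary between Gitleaks versions), and that the JSON report is a list of findings. The helper names `build_gitleaks_command` and `scan_text_blob` are hypothetical.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def build_gitleaks_command(source_dir, report_path):
    """Build a Gitleaks invocation that scans plain files, not Git history."""
    return [
        "gitleaks", "detect",
        "--no-git",
        "--source", str(source_dir),
        "--report-format", "json",
        "--report-path", str(report_path),
    ]


def scan_text_blob(text):
    """Write the blob to a temporary directory, scan it, return findings."""
    with tempfile.TemporaryDirectory() as workdir:
        source_dir = Path(workdir) / "source"
        source_dir.mkdir()
        (source_dir / "blob.txt").write_text(text, encoding="utf-8")
        report_path = Path(workdir) / "report.json"
        # check=False because Gitleaks exits non-zero when it finds leaks.
        subprocess.run(
            build_gitleaks_command(source_dir, report_path),
            check=False,
            capture_output=True,
        )
        if not report_path.exists():
            return []
        content = report_path.read_text(encoding="utf-8")
        return json.loads(content) if content.strip() else []
```

Spawning a process per blob is the simplest way to reuse the existing engine, but its per-call overhead is exactly the kind of cost the benchmarking mentioned in this section would need to measure.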
+
+Given our existing familiarity with the tool and its extensibility, it should
+remain our engine of choice. Changes to the detection engine are out of scope
+unless benchmarking reveals performance concerns.
+
+### Push event detection flow
+
+```mermaid
+sequenceDiagram
+    autonumber
+    actor User
+    User->>+Workhorse: git push
+    Workhorse->>+Gitaly: tcp
+    Gitaly->>+Rails: grpc
+    Sidekiq->>+Rails: poll job
+    Rails->>-Sidekiq: PostReceive worker
+    Sidekiq-->>+Sidekiq: enqueue PostReceiveSecretScanWorker
+
+    Sidekiq->>+Rails: poll job
+    loop PostReceiveSecretScanWorker
+      Rails->>-Sidekiq: PostReceiveSecretScanWorker
+      Sidekiq->>+SecretScanningSvc: ScanBlob(ref)
+      SecretScanningSvc->>+Sidekiq: accepted
+      Note right of SecretScanningSvc: Scanning job enqueued
+      Sidekiq-->>+Rails: done
+      SecretScanningSvc->>+Gitaly: retrieve blob
+      SecretScanningSvc->>+SecretScanningSvc: scan blob
+      SecretScanningSvc->>+Rails: secret found
+    end
+```
+
+## Iterations
+
+1. Requirements definition for detection coverage and actions
+1. PoC of secret scanning service
+    1. gRPC commit retrieval from Gitaly
+    1. blob scanning
+1. Implementation of secret scanning service MVC (targeting individual commits)
+1. Security and readiness review
+1. Deployment and monitoring
+1. Implementation of secret scanning service MVC (targeting arbitrary text blobs)
+1. Deployment and monitoring
+1. High priority domain object rollout (priority TBD)
+    1. Issuable comments
+    1. Issuable bodies
+    1. Job logs