Generic secret detection
### Release notes

GitLab Secret Detection now offers generic secret detection to identify unstructured secrets, passwords, and high-entropy strings that don't follow well-known patterns. This feature complements the existing regex-based pattern detection with entropy-based scanning to catch secrets that regex patterns miss. It improves security coverage for development teams and organizations by catching credentials that have no provider-specific format.

### Problem to solve

As a security engineer, I need to detect internal credentials and generic secrets that don't follow known provider patterns, so I can prevent data breaches caused by the many secrets that current pattern-based scanners miss.

### Intended users

* [Amy (Application Security Engineer)](https://handbook.gitlab.com/handbook/product/personas/#amy-application-security-engineer)
* [Sasha (Software Developer)](https://handbook.gitlab.com/handbook/product/personas/#sasha-software-developer)
* [Alex (Security Operations Engineer)](https://handbook.gitlab.com/handbook/product/personas/#alex-security-operations-engineer)
* [Delaney (Development Team Lead)](https://handbook.gitlab.com/handbook/product/personas/#delaney-development-team-lead)

### User experience goal

The user should be able to use GitLab Secret Detection to automatically identify generic secrets and passwords in their code without needing to write custom patterns. Generic detections should have a false positive rate below 15% to minimize noise.

- Generic secrets are detected alongside our current regex-based ruleset.
- Generic secrets can easily be _excluded_ in configuration profiles.
- Validity checks are not in scope.
- TBD: whether generic secrets are available for secret push protection.

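
As a rough illustration of what entropy-based scanning means in this context, the sketch below computes Shannon entropy in bits per character and flags long, high-entropy strings. The function names, the 20-character minimum, and the 3.0-bit threshold are assumptions chosen for illustration; they are not the analyzer's actual implementation or tuning.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Return the Shannon entropy of `value` in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_high_entropy(value: str, min_length: int = 20, threshold: float = 3.0) -> bool:
    """Hypothetical check: the string is long enough and its entropy
    meets an assumed threshold. Both defaults are illustrative only."""
    return len(value) >= min_length and shannon_entropy(value) >= threshold
```

A string made of one repeated character scores 0 bits, while a random hex string approaches the 4-bit maximum for a 16-symbol alphabet. Entropy alone cannot tell a secret from a checksum or commit hash, which is why a threshold like this would need surrounding context to keep false positives down.
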
### Proposal

Users can detect the following types of patterns using GitLab Secret Detection:

#### High-entropy strings

Examples:

* "fingerprint": "`19c01cb7157e4645e9e2c863062a85a8cbfbdcda`"
* string public constant fixedMetadataHash = "`QmRad1vxT3soFMNx9j3bBmkABb4C86anY1f5XeonosHy3m`"
* totpSecret: `IFTXE3SPOEYVURT2MRYGI52TKJ4HC3KH`
* localStorage.setItem('token', `'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJkYXRhIjp7fX0.bVBhvll6IaeR3aUdoOeyR8YZe2S2DfhGAxTGfd9enLw'`)
* address: `'Xr556RzuwX6hg5EGpkybbv5RanJoZN17kW'`
* return frisby.put(REST_URL + '/continue-code-fixIt/apply/`y28BEPE2k3yRrdz5p6DGqJONnj41n5UEWawYWgBMoVmL79bKZ8Qve0Xl5QLW'`)
* const testResponse = `'3be2e438b7f3d04c89d7749f727bb3bd'`

#### Generic passwords with contextual indicators

Examples:

* `admin_password=Password123!`
* `database_password=MySecurePass2024`
* `smtp_password=EmailAuth456$`
* `DB_PASS=Welcome123`
* `ADMIN_PWD=SuperSecret789!`
* `SERVICE_PASSWORD=TempPass2024#`
* `MYSQL_ROOT_PASSWORD=RootPass123`
* `POSTGRES_PASSWORD=DbAdmin456!`

#### Token-like strings

Examples:

* JWTs: `eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c`
* Base64-encoded values:
  * `SECRET_TOKEN=YWRtaW46cGFzc3dvcmQxMjM=`
  * `API_SECRET=VGhpcyBpcyBhIHNlY3JldCBtZXNzYWdl`
  * `AUTH_HEADER=Basic YWRtaW46c3VwZXJzZWNyZXQ=`
* Hex strings:
  * `ENCRYPTION_KEY=a1b2c3d4e5f6789012345678901234567890abcdef1234567890abcdef123456`
  * `SESSION_SECRET=f47ac10b58cc4372a5670e02b2c3d479`
  * `HMAC_SECRET=3b7e72f9c8a5d6e4f1a2b3c4d5e6f7g8h9i0j1k2l3m4n5o6p7q8r9s0t1u2v3w4x5y6z7`
* OAuth tokens:
  * `ACCESS_TOKEN=ya29.a0AfH6SMC7vK9j8L3m2N4o5P6q7R8s9T0u1V2w3X4y5Z6a7B8c9D0e1F2g3H4i5J6k7L8m9N0o1P2q3R4s5T6u7V8w9X0y1Z2`
  * `REFRESH_TOKEN=1//04z9xK7vL8m9N0o1P2q3R4s5T6u7V8w9X0y1Z2a3B4c5D6e7F8g9H0i1J2k3L4m5N6o7P8q9R0s1T2u3V4w5X6y7Z8a9B0c1D2e3F4`

### Documentation

**Required Documentation:**

- Update Secret Detection documentation to include generic detection capabilities
- Update documentation for new detection types on the [detected secrets](https://docs.gitlab.com/user/application_security/secret_detection/detected_secrets/) page

### Available Tier

This is a ~"GitLab Ultimate" feature.

### Feature Usage Metrics

**Tracking Metrics:**

- Number of generic secrets detected per repository and per namespace
- False positive rate (target: below 15%)
- (Internal) Reduction in security incidents from missed secrets
- Configuration usage (threshold adjustments, rule customization)

**Success Indicators:**

- Low false positive dismissal rates
- Positive user feedback on detection accuracy

### What does success look like, and how can we measure that?

**Success Metrics:**

- **Adoption**: 40% of ~"GitLab Ultimate" users enable the analyzer that includes generic secret detection within 6 months
- **Accuracy**: False positive rate below 15%
- **Coverage**: Detect 3x more unique secrets than regex-based pattern detection alone
- **Performance**: Less than 20% increase in scanning time

**Acceptance Criteria:**

- Generic secret detection successfully identifies high-entropy strings and generic passwords
- Context analysis reduces false positives by at least 50% compared to pure entropy scanning

### What is the type of buyer?

Security-conscious development teams and enterprises with mature DevSecOps practices who need comprehensive secret detection coverage beyond standard patterns.

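
To make the "contextual indicators" and "context analysis" ideas from the proposal and acceptance criteria more concrete, here is a minimal sketch that only reports a value when the name it is assigned to suggests a credential. The indicator keywords, the placeholder list, the 8-character minimum, and the function name are all hypothetical; this epic does not specify how the analyzer performs context analysis.

```python
import re

# Hypothetical indicator keywords; the real analyzer's list is not specified in this epic.
INDICATORS = ("password", "passwd", "pass", "pwd", "secret", "token")

# Assumed placeholder values that should not be reported as secrets.
PLACEHOLDERS = {"changeme", "example", "password", "null", "none"}

# Matches simple `KEY=value` or `key: "value"` assignments on one line.
ASSIGNMENT = re.compile(
    r"""(?P<key>[A-Za-z_][A-Za-z0-9_]*)\s*[=:]\s*["']?(?P<value>[^\s"']+)["']?"""
)


def find_contextual_secrets(line: str) -> list[tuple[str, str]]:
    """Return (key, value) pairs whose key name hints at a credential."""
    findings = []
    for match in ASSIGNMENT.finditer(line):
        key, value = match.group("key"), match.group("value")
        # Context check: skip assignments whose key carries no indicator.
        if not any(indicator in key.lower() for indicator in INDICATORS):
            continue
        # Noise reduction: skip obvious placeholders and very short values.
        if value.lower() in PLACEHOLDERS or len(value) < 8:
            continue
        findings.append((key, value))
    return findings
```

For example, `find_contextual_secrets("DB_PASS=Welcome123")` reports the assignment, while `timeout=30` and `password=changeme` are skipped. A substring match like this is deliberately naive (a key such as `compass_heading` would also match), so a production ruleset would need stricter key matching.
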
### Competitive Landscape

**Competitive Advantages:**

- **Integrated Experience**: Seamless integration with existing GitLab Secret Detection and security workflows
- **Added Context**: Advanced context analysis to reduce false positives beyond simple entropy scanning

**Differentiation from Competitors:**

- Unlike standalone tools, provides an integrated DevSecOps experience
- Advanced context analysis beyond basic entropy thresholds
- Native integration with GitLab's vulnerability management system
- Unified security dashboard and reporting

### Links / references

- [Modernizing Secrets Scanning: Part 1–the Problem](https://hackernoon.com/modernizing-secrets-scanning-part-1-the-problem) (Hackernoon)
- [DeepSecrets - a better tool for secret scanning](https://github.com/ntoskernel/deepsecrets) (OSS GitHub Repository)
- [Finding leaked passwords with AI: How we built Copilot secret scanning](https://github.blog/engineering/platform-security/finding-leaked-passwords-with-ai-how-we-built-copilot-secret-scanning/) (GitHub Blog)
- [Meet Nosey Parker — An Artificial Intelligence Based Scanner That Sniffs Out Secrets](https://www.praetorian.com/blog/nosey-parker-ai-secrets-scanner-release/) (Praetorian Blog)