What artifacts will be scanned for secrets in job artifacts?
Goal
- Determine a list of high-value report and file types to scan for secrets in job artifacts. This includes things like environment variables, logs, IaC, and test files.
- The final deliverable will be a table that we can use for documentation. It could look something like this:
| Report Type | Supported File Names | Supported File Types | Size Limit |
|---|---|---|---|
Please reconcile artifact report types with the supported files that we want to scan with secret detection for job artifacts.
- If a file will already appear in an AST vulnerability report in the pipeline, it doesn't need to be included, e.g. secret_detection artifacts don't need to be scanned.
- If a report type is unlikely to include secret information, we don't scan it for secrets either, e.g. accessibility.
List of supported files for Experiment
Source code files
Most artifacts are generated as an archive (.zip) file that can include source code files among other types of files. Therefore, it makes sense to support the most popular programming languages\[^1\], since a repository is essentially composed of source code files.
| Description | File Extension | Notes |
|---|---|---|
| Python | .py, .pyx | The .pyx file format is used in Cython (a variant of Python compiled to C). Other Python-related files include .pyd, .pyo, and .pyc, but those are compiled binaries or bytecode, so they can be skipped since we don't plan to scan binary files at this point. |
| JavaScript | .js, .jsx, .cjs, .mjs | The .cjs and .mjs file formats are used for CommonJS modules and ES modules, respectively. The .jsx file format is often used for React-based files. |
| Java | .java | - |
| C | .c, .h | Header file .h is also considered a source file. |
| C++ | .cpp, .hpp | Header file .hpp is also considered a source file. |
| C# | .cs | - |
| PHP | .php | - |
| TypeScript | .ts, .tsx | The .tsx file format is similar to .jsx but for TypeScript. |
| Ruby | .rb, .erb | The .erb file format is used for Embedded Ruby. |
| Swift | .swift | - |
| Go | .go | - |
| Rust | .rs | - |
| Kotlin | .kt, .kts | The .kts file format is used for Kotlin Scripting. |
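
Because artifacts arrive as .zip archives, the extension list above would in practice be applied to each archive entry. The following Python sketch illustrates that selection step only. The constant `SOURCE_EXTENSIONS`, the function names, and the in-memory demo archive are hypothetical and do not describe how the scanner is actually implemented.

```python
import io
import zipfile

# Hypothetical allowlist mirroring the source code table above.
SOURCE_EXTENSIONS = (
    ".py", ".pyx", ".js", ".jsx", ".cjs", ".mjs", ".java", ".c", ".h",
    ".cpp", ".hpp", ".cs", ".php", ".ts", ".tsx", ".rb", ".erb",
    ".swift", ".go", ".rs", ".kt", ".kts",
)


def scannable_entries(archive_bytes: bytes) -> list[str]:
    """Return names of archive members whose extension is on the allowlist."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return [
            info.filename
            for info in archive.infolist()
            # Skip directory entries; compare case-insensitively.
            if not info.is_dir()
            and info.filename.lower().endswith(SOURCE_EXTENSIONS)
        ]


def build_demo_archive() -> bytes:
    """Build a tiny in-memory archive so the sketch can be run as-is."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("src/app.py", "API_KEY = 'example'\n")
        archive.writestr("src/module.pyc", b"\x00\x01")  # bytecode, skipped
        archive.writestr("web/index.tsx", "export {};\n")
    return buffer.getvalue()


if __name__ == "__main__":
    print(scannable_entries(build_demo_archive()))
```

Matching on the extension alone is cheap, but it trusts the file name. A real implementation might also want to check that the content is text before scanning.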
Documentation files
These files are almost always text-based\[^2\], so they can be easily scanned without issues.
| Description | File Extension | Notes |
|---|---|---|
| AsciiDoc | .adoc, .asciidoc | - |
| Text | .txt, .rtf | - |
| Markdown | .markdown, .md, .mdx | - |
| HTML | .htm, .html, .xhtml | - |
| UML Diagrams | .uml | - |
| API Blueprint | .apib | - |
| Changelog Files | .changelog | - |
| Dependency Documentation | .deps | - |
| Graphviz | .dot, .gv | - |
| Java Documentation Files | .javadoc | - |
| JavaScript Documentation Files | .jsdoc | - |
| Mermaid Diagrams | .mermaid, .mmd | - |
| RESTful API Modeling Language | .raml | - |
| Ruby Documentation Files | .rdoc | - |
| reStructuredText Files | .rst | - |
Security Scan Results
Most of the security scanning tools format their reports into one of the following file formats.
| Description | File Extension | Notes |
|---|---|---|
| Static Analysis Results Interchange Format | .sarif | - |
| JSON | .json | - |
| XML | .xml | - |
| CSV | .csv | - |
Test Reports
Test reports are usually generated by code coverage tools. Some other file formats are produced when tools run tests and output the results into separate files.
| Description | File Extension | Notes |
|---|---|---|
| Code Coverage | .coverage, .lcov, .gcov, .clover, .xml | - |
| Test Results | .junit, .junit.xml, .nunit, .xunit | - |
Log Files
| Description | File Extension | Notes |
|---|---|---|
| Generic Log Files | .log, .info, .debug | - |
| Build Process Logs | .build | - |
| Standard Output Logs | .stdout, .out | - |
| Error Logs | .stderr, .err | - |
| Detailed Trace Logs | .trace | - |
Database Migration Scripts
These kinds of migration scripts or generated schemas could be included in the artifacts of a CI/CD job as well.
| Description | File Extension | Notes |
|---|---|---|
| SQL | .sql, .up.sql, .down.sql, .schema, .ddl, .pgsql, .psql, .mssql, .mysql, .mariadb | - |
| Database Markup Language | .dbml | - |
Static Assets
Static assets can be any type of text-based file compressed into a job artifact. This excludes images and other types of media.
| Description | File Extension | Notes |
|---|---|---|
| Scalable Vector Graphics | .svg | Since .svg is essentially an XML file, it can be scanned. |
| Stylesheets | .css, .scss, .sass, .less, .stylus, .css.map, .min.css | - |
| JavaScript Files | .js.map, .min.js, .mjs, .cjs | - |
Other Generic Files
| Description | File Extension | Notes |
|---|---|---|
| Metadata Files | .meta, .buildinfo, .pom | - |
| Build Manifest Files | .manifest | - |
| Checksum Files | .checksum | - |
| Dependency Lock Files | .lock, .lockfile | - |
| Configuration Files | .properties, .env, .toml, .yaml, .yml, .json, .ini, .conf, .config, .htaccess | - |
| Ignore Files | .gitignore, .dockerignore | - |
| Version Files | .version | - |
| Signature and Verification Files | .asc, .sig, .pub, .crt, .pem, .pgp, .sbom, .spdx | - |
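
Several entries in the tables above are compound suffixes (.junit.xml, .up.sql, .css.map) or dotfiles with no stem (.env, .gitignore). Python's `os.path.splitext` returns only the last suffix, and returns an empty extension for dotfiles, so a lookup keyed on that value would miss them. The sketch below shows one possible alternative, matching against the end of the base name. `ALLOWED_SUFFIXES` is a hypothetical sample of the tables, and both function names are illustrative only.

```python
import os

# Hypothetical sample of allowlisted suffixes taken from the tables above.
ALLOWED_SUFFIXES = (
    ".junit.xml", ".up.sql", ".css.map", ".min.js",
    ".log", ".env", ".gitignore",
)


def naive_extension(path: str) -> str:
    """Extension as reported by os.path.splitext (last suffix only)."""
    return os.path.splitext(path)[1]


def is_allowlisted(path: str) -> bool:
    """Match on the end of the base name so compound suffixes and dotfiles work."""
    name = os.path.basename(path).lower()
    return name.endswith(ALLOWED_SUFFIXES)


if __name__ == "__main__":
    for candidate in ("reports/unit.junit.xml", "config/.env", "img/logo.png"):
        print(candidate, repr(naive_extension(candidate)), is_allowlisted(candidate))
```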
GitLab Artifact Reports
In addition to the list above, we should also scan the following text-based artifact reports:
| Description | File Extension | Notes |
|---|---|---|
| sast | .json | Security Report. EE-only. |
| secret_detection | .json | Security Report. EE-only. |
| dependency_scanning | .json | Security Report. EE-only. |
| container_scanning | .json | Security Report. EE-only. |
| cluster_image_scanning | .json | Security Report. EE-only. |
| dast | .json | Security Report. EE-only. |
| license_scanning | .json | License Scanning Report. EE-only. |
| accessibility | .json | Accessibility Report. |
| codequality | .json | Code Quality Report. EE-only. |
| performance | .json | Performance Report. EE-only until %13.2. |
| browser_performance | .json | Browser Performance Report. EE-only. |
| load_performance | .json | Load Performance Report. EE-only. |
| terraform | .json | Terraform/OpenTofu Plan File. EE-only. |
| requirements | .json | Project Requirements File. Deprecated-soon. EE-only. |
| requirements_v2 | .json | Project Requirements File. |
| coverage_fuzzing | .json | Security Report. EE-only. |
| api_fuzzing | .json | Security Report. EE-only. |
Report Types
| # | Report Type | Description | Max file size | Median Size (MB) | Include in scanning |
|---|---|---|---|---|---|
| 1 | accessibility | Reports on the accessibility impact of changes introduced in merge requests. | 100 MB | | No |
| 2 | annotations | Attached to a job to add a link to the job output page. | 100 MB | | No |
| 3 | api_fuzzing | The api_fuzzing report collects [API Fuzzing bugs](https://docs.gitlab.com/user/application_security/api_fuzzing/) as artifacts. | 100 MB | | No |
| 4 | archive | | 100 MB | | |
| 5 | browser_performance | The browser_performance report collects Browser Performance Testing metrics as an artifact. This artifact is a JSON file output by the Sitespeed plugin. | 100 MB | | No |
| 6 | cluster_image_scanning | | 100 MB | | |
| 7 | cobertura (coverage_report) | View test coverage results in merge requests, line-by-line coverage in file diffs, and overall metrics. | 100 MB | | Under consideration |
| 8 | code_quality | The codequality report collects [code quality issues](https://docs.gitlab.com/ci/testing/code_quality/). | 100 MB | | |
| 9 | container_scanning | Report collects Container Scanning vulnerabilities. | 100 MB | | No |
| 10 | coverage_fuzzing | Report collects coverage fuzzing bugs. | 100 MB | | No |
| 11 | cyclonedx | This report is a Software Bill of Materials describing the components of a project following the [CycloneDX](https://cyclonedx.org/docs/1.4) protocol format. | 5 MB | | No |
| 12 | dast | The dast report collects DAST vulnerabilities. | 100 MB | | No |
| 13 | dependency_scanning | The dependency_scanning report collects Dependency Scanning vulnerabilities. | 100 MB | | No |
| 14 | dotenv | The dotenv report collects a set of environment variables as artifacts. | 100 MB | | Yes |
| 15 | jacoco | | 100 MB | | |
| 16 | junit | The junit report collects JUnit report format XML files. This is a collection of unit test reports. | 100 MB | | Yes |
| 17 | license_scanning | | 100 MB | | |
| 18 | load_performance | The load_performance report collects Load Performance Testing metrics. | 100 MB | | No |
| 19 | lsif | | 200 MB | | |
| 20 | metadata | | 100 MB | | |
| 21 | metrics | You can configure your job to use custom Metrics Reports, and GitLab displays a report on the merge request so that it’s easier and faster to identify changes without having to check the entire log. | 100 MB | | |
| 22 | metrics_referee | | 100 MB | | |
| 23 | performance | | 100 MB | | |
| 24 | repository_xray (deprecated) | The repository_xray report collects information about your repository for use by GitLab Duo Code Suggestions. | 100 MB | | No |
| 25 | requirements | | 100 MB | | |
| 26 | requirements_v2 | | 100 MB | | |
| 27 | sast | The sast report collects SAST vulnerabilities. | 100 MB | | No |
| 28 | secret_detection | The secret_detection report collects detected secrets. | 100 MB | | Yes |
| 29 | terraform | The terraform report obtains an OpenTofu tfplan.json file. | 5 MB | | Yes |
| 30 | trace | | 100 MB | | |
- archive: shows up when a job uploads at least one artifact.
- metadata: shows up when a job uploads at least one artifact. metadata has information about the entries in the artifact archive.
- trace: always shows up for every job, with some delay.
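
The Report Types table can be read as a small policy lookup that pairs a size limit with an include decision for each report type. The Python sketch below shows one way such a lookup might be expressed. It copies only a few rows from the table and rests on assumptions that are not in the table: 1 MB is taken as 1024 * 1024 bytes, "Under consideration" is treated as not included, and unknown report types default to not scanned. `ReportPolicy`, `POLICIES`, and `should_scan` are hypothetical names.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: binary megabytes


@dataclass(frozen=True)
class ReportPolicy:
    max_size_bytes: int
    include: bool


# A few rows copied from the Report Types table; other rows would follow the same shape.
POLICIES = {
    "dotenv": ReportPolicy(100 * MB, True),
    "junit": ReportPolicy(100 * MB, True),
    "secret_detection": ReportPolicy(100 * MB, True),
    "terraform": ReportPolicy(5 * MB, True),
    "accessibility": ReportPolicy(100 * MB, False),
    "sast": ReportPolicy(100 * MB, False),
    # "Under consideration" in the table; treated as not included here (assumption).
    "cobertura": ReportPolicy(100 * MB, False),
}


def should_scan(report_type: str, size_bytes: int) -> bool:
    """Scan only included report types that fit within their size limit."""
    policy = POLICIES.get(report_type)
    if policy is None:
        return False  # assumption: unknown report types are skipped
    return policy.include and size_bytes <= policy.max_size_bytes


if __name__ == "__main__":
    print(should_scan("dotenv", 10 * MB))
    print(should_scan("terraform", 6 * MB))
```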