Evil webhook forces connections to last forever, DoS
HackerOne report #1029269 by afewgoats on 2020-11-08, assigned to @dcouture
Report
Summary
GitLab webhooks do not properly obey timeouts, so connections can be forced to last forever.
A malicious webhook receiver can respond in such a way that:
- The HTTP(S) connection between the GitLab sidekiq process and the webhook server never ends
- Large amounts of data are kept in memory
- Giant webhook responses are stored, potentially causing processes to be killed for running out of memory
It's not just project webhooks: any use of Gitlab::HTTP.post, such as system webhooks and integrations (e.g. Mattermost), is affected.
Steps to reproduce
A test webhook receiver is attached. It can be run with node never-end.js and serves on port 3000. To use it, you may need to enable local webhooks. Alternatively, you can use my server at https://gitlab.example.com/never-end/ but please redact all mention of my domain gitlab.example.com before publishing.
First, create a project.
1. Never ending request
Create a webhook for the project with URL pointing to: https://gitlab.example.com/never-end/gitlab
Trigger the webhook (e.g. by creating a new issue, if that was selected as a trigger). Don't use the webhook Test button: it times out after 60 seconds, so it isn't as badly affected.
The webhook server replies by sending 1 byte every 3.21 seconds. The read timeout isn't hit because data is still being transferred, albeit very slowly.
See that:
- The sidekiq job never finishes (see /admin/background_jobs and screenshot)
- A TCP connection from the sidekiq process to the webhook server is maintained forever. This can be seen with netstat or tcpdump.
The webhook can be triggered multiple times to have multiple concurrent connections.
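The behaviour in scenario 1 follows from how a per-read timeout works: the timer restarts every time any data arrives, so a sender that drips bytes faster than the timeout is never cut off. The following is a minimal Python sketch of that effect, not GitLab or HTTParty code. It assumes a local socket pair standing in for the webhook connection, and the constants READ_TIMEOUT, DRIP_INTERVAL and DRIP_COUNT are hypothetical values chosen so the demo finishes quickly.

```python
import socket
import threading
import time
from typing import Tuple

READ_TIMEOUT = 0.5   # per-read timeout, analogous to an HTTP client's read timeout
DRIP_INTERVAL = 0.2  # gap between bytes; deliberately shorter than READ_TIMEOUT
DRIP_COUNT = 5       # finite here so the demo ends; the real receiver never stops


def drip(sock: socket.socket) -> None:
    """Send one byte at a time, always sooner than the reader's timeout."""
    for _ in range(DRIP_COUNT):
        time.sleep(DRIP_INTERVAL)
        sock.sendall(b".")
    sock.close()


def read_all(sock: socket.socket) -> Tuple[int, float]:
    """Read until EOF; return (bytes received, seconds elapsed)."""
    sock.settimeout(READ_TIMEOUT)  # applies to each recv() call separately
    start = time.monotonic()
    received = 0
    while True:
        chunk = sock.recv(4096)  # times out only if idle for READ_TIMEOUT
        if not chunk:
            break
        received += len(chunk)
    return received, time.monotonic() - start


def demo() -> Tuple[int, float]:
    """Run the slow sender against the reader over a local socket pair."""
    reader, writer = socket.socketpair()
    sender = threading.Thread(target=drip, args=(writer,), daemon=True)
    sender.start()
    try:
        return read_all(reader)
    finally:
        reader.close()
        sender.join()


if __name__ == "__main__":
    count, elapsed = demo()
    print(f"received {count} bytes in {elapsed:.2f}s without a timeout")
```

The read completes without raising even though the total time exceeds READ_TIMEOUT, because no single recv() call ever waits that long.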
2. Several hundred MB of data, then never ending request
Change webhook URL to: https://gitlab.example.com/never-end/mega/gitlab
Now trigger the webhook and monitor the memory used by the sidekiq process. It will increase dramatically and stay high forever.
The webhook server replies by sending ~530MB of data and then 1 byte every 3.21 seconds. The read timeout isn't hit because data is still being transferred, albeit very slowly.
The webhook can be triggered multiple times to have multiple concurrent connections and further increase the memory wastage.
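One way to bound the memory held per request in scenario 2 is to read the body in chunks and abort once a size cap is exceeded. This is a sketch under stated assumptions, not the fix GitLab shipped: read_capped, BodyTooLarge and the 1 MiB default are hypothetical names and values, and the stream is any file-like object.

```python
import io
from typing import BinaryIO

MAX_BODY_BYTES = 1024 * 1024  # hypothetical 1 MiB cap; not a GitLab setting


class BodyTooLarge(Exception):
    """Raised when a response body exceeds the configured cap."""


def read_capped(stream: BinaryIO, max_bytes: int = MAX_BODY_BYTES,
                chunk_size: int = 64 * 1024) -> bytes:
    """Read a body in chunks, aborting once more than max_bytes have arrived."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return b"".join(chunks)
        total += len(chunk)
        if total > max_bytes:
            # Stop reading instead of buffering the rest of the body.
            raise BodyTooLarge(f"body exceeded {max_bytes} bytes")
        chunks.append(chunk)


if __name__ == "__main__":
    small = io.BytesIO(b"ok")
    print(read_capped(small))
    try:
        read_capped(io.BytesIO(b"x" * 2048), max_bytes=1024)
    except BodyTooLarge as exc:
        print("rejected:", exc)
```

With a cap like this, at most max_bytes plus one chunk is buffered per connection, however much the receiver sends.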
3. Several hundred MB of data, then request ends
This isn't timeout related, but more to do with storing large webhook responses.
Change webhook URL to: https://gitlab.example.com/never-end/mega/exit/gitlab
This time, the request will end successfully after a few seconds. The response size is ~530MB. It is stored in postgres in the webhook_logs table.
Go to edit the webhook and scroll down to find the request in "Recent Deliveries". Click "View details" (link to hook_logs) and it tries to show the full HTTP body. The Puma worker is probably killed due to running out of memory.
Making several simultaneous requests to view the hook logs pushed all CPU cores on my computer to 100%, and I had to force reboot it, e.g.:
# Fire 15 concurrent requests at the hook log page
for i in $(seq 1 15); do curl 'https://GITLABSERVER/gitlab-instance-027a8c7d/monitoring/hooks/1/hook_logs/3' \
-H 'cookie: ...ENTER COOKIE...' \
--compressed \
--insecure > /dev/null & done
When I increased the response size of this endpoint to 666MB I got this error in postgres, but possibly just due to my setup:
ERROR: invalid memory alloc request size 1073741824
STATEMENT: INSERT INTO "web_hook_logs" ("web_hook_id", "trigger", "url", "request_headers", "request_data", "response_headers", "response_body", "response_status", "execution_duration", "created_at", "updated_at") VALUES (1, 'issue_hooks', 'http://192.168.4.20:3000/mega/exit', '---
Content-Type: application/json
X-Gitlab-Event: Issue Hook
...
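Scenario 3 stores the entire response body in the database and later renders it in full. One possible mitigation, given here only as an illustrative Python sketch and not as GitLab's actual change, is to truncate the body before it is persisted. The function name truncate_for_log, the marker text and the 8 KiB default are all hypothetical.

```python
TRUNCATION_MARKER = "\n[truncated]"  # hypothetical marker appended to cut bodies


def truncate_for_log(body: bytes, limit: int = 8192) -> str:
    """Return at most `limit` bytes of the body as text, flagged if cut short."""
    # Decode leniently: the slice may end in the middle of a multi-byte character.
    text = body[:limit].decode("utf-8", errors="replace")
    if len(body) > limit:
        text += TRUNCATION_MARKER
    return text


if __name__ == "__main__":
    print(truncate_for_log(b"short response"))
    print(len(truncate_for_log(b"x" * 1_000_000)))
```

Truncating before the INSERT keeps both the stored row and the later "View details" page small, regardless of what the receiver returned.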
Examples
Most testing was done on my docker instance.
I added a webhook to https://gitlab.com/afewgoatstest/mytestproject/ (not public) and launched a request: there is currently a never-ending connection from 34.74.90.72 to my server. To avoid disrupting service, I didn't try the /mega versions, which send hundreds of MB, on gitlab.com.
What is the current bug behavior?
There is no overall timeout on the webhook connection except when using the Test Webhook function. GitLab also accepts huge response bodies.
What is the expected correct behavior?
After 10 seconds (or some other reasonable timeout), webhook requests time out even if data is still being transferred.
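The expected behaviour amounts to an overall deadline on top of the per-read timeout. The Python sketch below shows one way to express that with a monotonic clock. It is an assumption-laden illustration, not the HTTParty or Gitlab::HTTP implementation: read_with_deadline and drip_forever are hypothetical helpers, and a raw socket stands in for the HTTP response stream.

```python
import socket
import threading
import time


def read_with_deadline(sock: socket.socket, total_timeout: float,
                       read_timeout: float) -> bytes:
    """Read until EOF, failing if the whole read exceeds total_timeout seconds."""
    deadline = time.monotonic() + total_timeout
    chunks = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("overall deadline exceeded")
        # Never wait longer than the time left before the overall deadline.
        sock.settimeout(min(read_timeout, remaining))
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            raise TimeoutError("read timed out") from None
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def drip_forever(sock: socket.socket, interval: float,
                 stop: threading.Event) -> None:
    """Hypothetical slow sender, used only to exercise the reader."""
    try:
        while not stop.is_set():
            sock.sendall(b".")
            time.sleep(interval)
    except OSError:
        pass  # the reader closed its end


if __name__ == "__main__":
    reader, writer = socket.socketpair()
    stop = threading.Event()
    threading.Thread(target=drip_forever, args=(writer, 0.05, stop),
                     daemon=True).start()
    try:
        read_with_deadline(reader, total_timeout=0.3, read_timeout=0.2)
    except TimeoutError as exc:
        print("aborted:", exc)
    finally:
        stop.set()
        reader.close()
```

Because the deadline is computed once up front, a sender that keeps the connection busy can no longer extend the request past total_timeout.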
Relevant logs and/or screenshots
Stack trace where the sidekiq process gets stuck:
lib/gitlab/http.rb:44:in `perform_request'
app/services/web_hook_service.rb:87:in `make_request'
app/services/web_hook_service.rb:39:in `execute'
app/workers/web_hook_worker.rb:16:in `perform'
lib/gitlab/metrics/sidekiq_middleware.rb:18:in `block in call'
lib/gitlab/metrics/transaction.rb:61:in `run'
lib/gitlab/metrics/sidekiq_middleware.rb:18:in `call'
lib/gitlab/sidekiq_middleware/duplicate_jobs/strategies/until_executing.rb:32:in `perform'
lib/gitlab/sidekiq_middleware/duplicate_jobs/duplicate_job.rb:40:in `perform'
lib/gitlab/sidekiq_middleware/duplicate_jobs/server.rb:8:in `call'
lib/gitlab/sidekiq_middleware/worker_context.rb:9:in `wrap_in_optional_context'
lib/gitlab/sidekiq_middleware/worker_context/server.rb:17:in `block in call'
lib/gitlab/application_context.rb:54:in `block in use'
lib/gitlab/application_context.rb:54:in `use'
lib/gitlab/application_context.rb:21:in `with_context'
lib/gitlab/sidekiq_middleware/worker_context/server.rb:15:in `call'
lib/gitlab/sidekiq_status/server_middleware.rb:7:in `call'
lib/gitlab/sidekiq_versioning/middleware.rb:9:in `call'
lib/gitlab/sidekiq_middleware/admin_mode/server.rb:8:in `call'
lib/gitlab/sidekiq_middleware/instrumentation_logger.rb:7:in `call'
lib/gitlab/sidekiq_middleware/batch_loader.rb:7:in `call'
lib/gitlab/sidekiq_middleware/extra_done_log_metadata.rb:7:in `call'
lib/gitlab/sidekiq_middleware/request_store_middleware.rb:10:in `block in call'
lib/gitlab/with_request_store.rb:17:in `enabling_request_store'
lib/gitlab/with_request_store.rb:10:in `with_request_store'
lib/gitlab/sidekiq_middleware/request_store_middleware.rb:9:in `call'
lib/gitlab/sidekiq_middleware/server_metrics.rb:35:in `call'
lib/gitlab/sidekiq_middleware/monitor.rb:8:in `block in call'
lib/gitlab/sidekiq_daemon/monitor.rb:49:in `within_job'
lib/gitlab/sidekiq_middleware/monitor.rb:7:in `call'
lib/gitlab/sidekiq_logging/structured_logger.rb:18:in `call'
perform_request calls HTTParty, which is probably where the issue is.
The Mattermost integration stack trace is similar:
lib/gitlab/http.rb:44:in `perform_request'
app/models/project_services/slack_service.rb:55:in `post'
app/models/project_services/slack_service.rb:45:in `notify'
app/models/project_services/chat_notification_service.rb:94:in `execute'
lib/gitlab/metrics/instrumentation.rb:160:in `block in execute'
lib/gitlab/metrics/method_call.rb:27:in `measure'
lib/gitlab/metrics/instrumentation.rb:160:in `execute'
app/workers/project_service_worker.rb:13:in `perform'
lib/gitlab/metrics/sidekiq_middleware.rb:18:in `block in call'
...
Output of checks
This bug happens on GitLab.com and in docker.
Results of GitLab environment info
Using gitlab-ee docker image
System information
System:
Proxy: no
Current User: git
Using RVM: no
Ruby Version: 2.6.6p146
Gem Version: 2.7.10
Bundler Version: 1.17.3
Rake Version: 12.3.3
Redis Version: 5.0.9
Git Version: 2.28.0
Sidekiq Version: 5.2.9
Go Version: unknown
GitLab information
Version: 13.5.3-ee
Revision: b9d194b6b91
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: PostgreSQL
DB Version: 11.9
URL: https://gitlab.example.com
HTTP Clone URL: https://gitlab.example.com/some-group/some-project.git
SSH Clone URL: git@gitlab.example.com:some-group/some-project.git
Elasticsearch: no
Geo: no
Using LDAP: no
Using Omniauth: yes
Omniauth Providers:
GitLab Shell
Version: 13.11.0
Repository storage paths:
- default: /var/opt/gitlab/git-data/repositories
GitLab Shell path: /opt/gitlab/embedded/service/gitlab-shell
Git: /opt/gitlab/embedded/bin/git
Impact
Denial of service.
An attacker can keep never-ending HTTP(S) connections open, deplete the connection pool, reserve memory, slow things down, get processes killed for running out of memory, and maybe take down the server.
The attacker can be either:
- Someone who can add webhooks to a project (or system webhook or integration)
- The host of a server which has a webhook pointed to it
Attachments
Warning: Attachments received through HackerOne, please exercise caution!
How To Reproduce
Please add reproducibility information to this section:
- Host the never-end.js script on a server
- Create a webhook that sends a request to http://server:3000/, or http://server:3000/mega or http://server:3000/mega/exit depending on the scenario you're testing
- Observe sidekiq jobs stuck forever and resource consumption