Skip to content

DoS by repeatedly sending unauthenticated requests for diff-files of a commit or merge request

Please read the process on how to fix security issues before starting to work on the issue. Vulnerabilities must be fixed in a security mirror.

HackerOne report #2650086 by a92847865 on 2024-08-11, assigned to @kmorrison1:

Report | Attachments | How To Reproduce

Report

NOTE! Thanks for submitting a report! Please note that initial triage is handled by HackerOne staff. They are identified with a HackerOne triage badge and will escalate to the GitLab team any. Please replace all the (parenthesized) sections below with the pertinent details. Remember, the more detail you provide, the easier it is for us to triage and respond quickly, so be sure to take your time filling out the report!

Summary

Diff_files of a commit or merge request can be requested using below endpoints:

Generating diff_files is CPU-time consuming task, especially for syntax highlighting part. There is a "time-out" (1.5 sec) which should be a sufficient protection against uncontrolled resource consumption.

def highlight_rich(text, continue: true)  
      tag = lexer.tag  
      tokens = lexer.lex(text, continue: continue) # [@]reporter: lack of time-out here  
      Gitlab::RenderTimeout.timeout { [@]formatter.format(tokens, **context, tag: tag).html_safe }  
   .....  
    end  

But that protection unfortunately is kinda useless, for two reasons:

  • Firstly, Gitlab::RenderTimeout.timeout only applies for formatting step, not applies for tokenization step.
  • Secondly, for diff-files requests (of a commit or merge request), there's currently no "time-out" protection for the whole process of generating several (many) diff-files.

Attack scenario

  • Attacker looks for big commits/merge-requests of a public project. For example: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/32372/diffs_batch.json. It takes Gitlab.com server ~17-18 seconds to respond to a request for "diffs_batch.json" file of that merge_request.
  • Attacker repeatedly sends unauthenticated requests to keep all CPUs of Gitlab instance being busy --> unreachable for legitimate users.
  • The number of malicious requests per second is low since it takes tens of seconds to handle a malicious request. For example: 1 RPS (5% peak load) for a "1k-users Gitlab EE instance" (peak load: 20 RPS for normal requests)
Steps to reproduce

(Step-by-step guide to reproduce the issue, including:)

  1. Register a "1k-users Gitlab" VM (8 vCPU, 16GB RAM). Install latest Gitlab EE.
  2. Create a public project "DoS" in public group "Test".
  3. Extract this diff.zip and create a commit to add 21 extracted files to project "DoS".
  4. Open Web Developer Tools, click to view above commit, replace necessary fields in below URL:
    http(s)://[host-name]/[group]/[project]/-/commit/[:sha]/diff_files?expanded=1.
  5. Log-out of your account, paste above URL in a new tab (with necessarily replaced fields), 6. Open Web Developer Tools, click enter. Then get equivalent curl command by right click on relevant networking item in networking tab of Web Developer Tools.
  6. Use this shell script diff.zip to send malicious requests to Gitlab instance. Replace: [url] and curl command headers with data you collected from steps 4-6.
  7. SSH into "1k-users Gitlab" VM. You should see that all CPUs are at ~ 100% utilized rate.
  8. Open your browser to connect to Gitlab instance. Gitlab instance is unusable.
  9. Stop sending malicious requests. It takes some minutes for Gitlab instance to respond normally to legitimate users.
Impact

Attacker can make a Gitlab public instance like Gitlab.com unreachable to legitimate users by repeatedly sending unauthenticated requests for diff-files

Examples

(If the bug is project related, please create an example project and export it using the project export feature)

(If you are using an older version of GitLab, this will also help determine whether the bug has been fixed in a more recent version)

(If the bug can be reproduced on GitLab.com without violating the Rules of Engagement as outlined in the program policy, please provide the full path to the project.)

What is the current bug behavior?

(What actually happens, include relevant screenshots, API results, or complete HTTP requests)

There's no "time-out" protection for below endpoints:

What is the expected correct behavior?

(What you should see instead, include relevant screenshots, API results, or complete HTTP requests)

Set "time-out" protection for below endpoints:

Relevant logs and/or screenshots

(Paste any relevant logs - please use code blocks (```) to format console output,
logs, and code as it's very hard to read otherwise.)

Output of checks

This bug happens on both Gitlab self-managed and GitLab.com

Results of GitLab environment info

(For installations with omnibus-gitlab package run and paste the output of:
sudo gitlab-rake gitlab:env:info)

(For installations from source run and paste the output of:
sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production)

System information  
System:         Ubuntu 20.04  
Proxy:          no  
Current User:   git  
Using RVM:      no  
Ruby Version:   3.1.5p253  
Gem Version:    3.5.11  
Bundler Version:2.5.11  
Rake Version:   13.0.6  
Redis Version:  7.0.15  
Sidekiq Version:7.1.6  
Go Version:     unknown

GitLab information  
Version:        17.2.2-ee  
Revision:       49db6c4d6ac  
Directory:      /opt/gitlab/embedded/service/gitlab-rails  
DB Adapter:     PostgreSQL  
DB Version:     14.11  
URL:            http://34.68.164.218  
HTTP Clone URL: http://34.68.164.218/some-group/some-project.git  
SSH Clone URL:  git@34.68.164.218:some-group/some-project.git  
Elasticsearch:  no  
Geo:            no  
Using LDAP:     no  
Using Omniauth: yes  
Omniauth Providers: 

GitLab Shell  
Version:        14.37.0  
Repository storages:  
- default:      unix:/var/opt/gitlab/gitaly/gitaly.socket  
GitLab Shell path:              /opt/gitlab/embedded/service/gitlab-shell

Gitaly  
- default Address:      unix:/var/opt/gitlab/gitaly/gitaly.socket  
- default Version:      17.2.2  
- default Git Version:  2.45.2  
PoC video:

Gitlab_DOS.mp4

Impact

Attacker can make a Gitlab public instance like Gitlab.com unreachable to legitimate users by repeatedly sending unauthenticated requests for diff-files (which takes tens of seconds of CPU time to generate) of a commit or merge request on a public repository. The number of malicious requests for taking a "1k-users Gitlab EE" instance is low (~1RPS)

Attachments

Warning: Attachments received through HackerOne, please exercise caution!

How To Reproduce

Please add reproducibility information to this section:

DRI's summary, notes, and resources

Summary

Resources

  • A previous issue mentions our current handling of this was temporary and we may be able to use something different now.

Notes

  • Per this comment, it has been confirmed locally that it takes "significant time to render diff_files?expanded=1 endpoint."

Tracing code path from controller to highlight

  • Projects::CommitController#diff_files
    • defines @diffs which are instantiated in define_commit_vars @diffs = commit.diffs(opts)
    • they are passed to render template: 'projects/commit/diff_files'
  • projects/commit/diff_files processes the diffs into diff_files
    • diff_files = conditionally_paginate_diff_files(diffs, paginate: true, page: params[:page], per: Projects::CommitController::COMMIT_DIFFS_PER_PAGE)

      • conditionally_paginate_diff_files returns diffs.diff_files

      • diff.diff_files appears to return a collection of Gitlab::Diff::File.new

      • so in diff_files we pass a collection on Gitlab::Diff::Files via render partial: 'projects/diffs/file', collection: diff_files, as: :diff_file

      • in app/views/projects/diffs/_file.html.haml we = render 'projects/diffs/content', diff_file: diff_file

      • in = render 'projects/diffs/content', diff_file: diff_file all paths end up rendering projects/diffs/viewer

        • before rendering though, part of it's logic is figuring out the viewer, which gets used in the next partial
      • projects/diffs/viewer renders viewer.partial_path, viewer: viewer

      • I think in the case we are working on app/views/projects/diffs/viewers/_text.html.haml is used.

      • It renders = render "projects/diffs/text_file", diff_file: diff_file, total_lines: total_lines

      • app/views/projects/diffs/_text_file.html.haml does this:

          - diff_file.highlighted_diff_lines.each do |line|
          - line_code = diff_file.line_code(line)
      • that's defined in lib/gitlab/diff/file.rb

        def highlighted_diff_lines
          @highlighted_diff_lines ||=
            Gitlab::Diff::Highlight.new(self, repository: self.repository).highlight
        end
      • highlight is defined in lib/gitlab/diff/highlight.rb

        def highlight
          populate_marker_ranges
        
          @diff_lines.map do |diff_line|
            diff_line = diff_line.dup
            # ignore highlighting for "match" lines
            next diff_line if diff_line.meta?
        
            rich_line = apply_syntax_highlight(diff_line) # This is the key line
            rich_line = apply_marker_ranges_highlight(diff_line, rich_line)
        
            diff_line.rich_text = rich_line
        
            diff_line
          end
        end

        apply_syntax_highlight is also defined in lib/gitlab/diff/highlight.rb

        def apply_syntax_highlight(diff_line)
          highlight_line(diff_line) || ERB::Util.html_escape(diff_line.text)
        end
        def highlight_line(diff_line)
          return unless diff_file && diff_file.diff_refs
          return diff_line_highlighting(diff_line, plain: true) if blobs_too_large?
        
          if Feature.enabled?(:diff_line_syntax_highlighting, project)
            diff_line_highlighting(diff_line)
          else
            blob_highlighting(diff_line)
          end
        end
        def diff_line_highlighting(diff_line, plain: false)
          rich_line = syntax_highlighter(diff_line).highlight(   # Notice we're calling highlight
            diff_line.text(prefix: false),
            plain: plain,
            context: { line_number: diff_line.line },
            used_on: :diff
          )
        
          # Only update text if line is found. This will prevent
          # issues with submodules given the line only exists in diff content.
          if rich_line
            line_prefix = diff_line.text =~ PREFIX_REGEXP ? Regexp.last_match(1) : ' '
            rich_line.prepend(line_prefix).concat("\n")
          end
        end
      • So what was the syntax_highlighter.highlight? Turns out that's Gitlab::Highlight, which is where our investigation started.

        def syntax_highlighter(diff_line)
          path = diff_line.removed? ? diff_file.old_path : diff_file.new_path
        
          @syntax_highlighter ||= {}
          @syntax_highlighter[path] ||= Gitlab::Highlight.new(    # Back to where we started investigating
            path,
            @raw_lines,
            language: repository&.gitattribute(path, 'gitlab-language')
          )
        end

      The only slightly fuzzy part for me is the last couple steps of the partials. There's a lot of conditional logic in the templates controlling things, but I think this is right.

Edited by Hunter Stewart