
Slow regular expression in `ee/lib/gitlab/elastic/search_results.rb`

In Set Global timeout for Regexp to prevent ReDOS (!145679 - merged) we introduced a 50-second global timeout for all Regexp operations to prevent Regular Expression Denial of Service (ReDoS) issues.

In the last 2 days we've seen 4 errors in https://new-sentry.gitlab.net/organizations/gitlab/issues/1082668/events/8089049393df42738db2e0ae32b503a4/

This is the line https://gitlab.com/gitlab-org/gitlab/-/blob/73c4431aad4148a736bc2a33678cc693ed3c5c18/ee/lib/gitlab/elastic/search_results.rb#L44

        matched_lines_count = highlight_content
          .scan(/#{::Elastic::Latest::GitClassProxy::HIGHLIGHT_START_TAG}(.*)\R/o).size

This is the constant definition https://gitlab.com/gitlab-org/gitlab/-/blob/c7e7f09305f7a74eab30ab4c85b523a5693f0c78/ee/lib/elastic/latest/git_class_proxy.rb#L8

      HIGHLIGHT_START_TAG = 'gitlabelasticsearch→'

My guess is that this is slow when the returned text is especially large. I think using RE2 would resolve the issue; however, the `\R` we're using doesn't exist in RE2, so we'd have to figure out an exact replacement.

Alternatively, could this possibly be written without using regular expressions?
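
As a rough sketch of what a regex-free version might look like (not a tested patch): since `.` does not cross `\n` in Ruby, each match of the current pattern consumes the rest of the line, so the count is effectively "lines that contain the start tag and end with a line break". The helper name `count_highlighted_lines` is hypothetical, and the sketch assumes the highlighted content only uses `\n` or `\r\n` as line terminators, whereas `\R` also accepts a bare `\r`, `\v`, `\f` and the Unicode line separators.

```ruby
# Hypothetical helper, not part of the GitLab codebase.
# Counts newline-terminated lines that contain the highlight start tag.
# Assumption: lines end in "\n" or "\r\n"; other \R terminators are ignored.
def count_highlighted_lines(highlight_content, tag)
  highlight_content.each_line.count do |line|
    # each_line keeps the terminator, so a final line without a trailing
    # newline is skipped, just as the regex skips it for lack of \R.
    line.end_with?("\n") && line.include?(tag)
  end
end
```

In `search_results.rb` this would be called with `::Elastic::Latest::GitClassProxy::HIGHLIGHT_START_TAG` as the `tag` argument. It walks the string once with plain substring checks, so there is no backtracking to time out on.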

Labeling as bug::vulnerability because it could be abused to hurt GitLab performance. Since the timeout keeps execution under control, we can apply security-fix-in-public.

Edited by Ravi Kumar