Injection: GitLab Code Navigation relies solely on input validation of HTML by untrusted sources
Summary
Code Navigation is a feature that gives us "jump to definition" in files. It is gated by the :code_navigation
feature flag. WIP docs are here: !26069 (merged)
While reviewing gitlab-workhorse!492 (diffs, comment 344333731) with @igor.drozdov , I noticed that we're doing input validation of values, escaping them, and outputting them into raw HTML that is then shown directly from the file by the frontend.
This merge request simply internalises code to workhorse that is already running within CI jobs - which is to say, the input validator is itself untrusted. At the moment, CI jobs are generating HTML fragments and packaging them into JSON files in the artifact.zip
file we upload to GitLab. With the feature flag on, those fragments are being extracted from the artifact.zip
and injected directly into the frontend, leading to XSS, SSRF, etc, vulnerabilities.
Steps to reproduce
- Fork a project with code intelligence enabled
- Modify
.gitlab-ci.yml
to generate malicious code-navigation data in theartifact.zip
file - Create a merge request
- Send the link to a project member
The frontend gets a link to a .json
file inside the artifact.zip
from the backend: https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/code_navigation_path.rb#L19 ; the frontend GETs that file, and interpolates the content of some of the items directly into the page, e.g. here: https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/assets/javascripts/code_navigation/components/popover.vue#L86
UNTRUSTED CONTENT
Example Project
Not actually tested, this is dead reckoning at the moment
What is the current bug behavior?
Untrusted HTML, JS, etc, is rendered/executed.
What is the expected correct behavior?
Escaped malicious content, or better yet, nothing at all.
Output of checks
This bug happens on GitLab.com
Possible fixes
The workhorse MR at least moves the processing step to code we control and can (in theory) trust. However, I think we rely entirely on input validation at the moment, which is a worry. There's always a chance people will get in artifact.zip
files that haven't been through this step, or they
Instead, we should be able to apply some form of output validation. If we keep the architecture where the artifact.zip
contains raw HTML, this is fairly difficult. We could run it through the Banzai pipeline, perhaps?
Alternatively, we could rework this feature so it provides some sort of metadata that is converted to HTML in either the frontend or the backend, so we don't need to inject raw HTML directly into the page.
/cc @igor.drozdov @danielgruesso @gitlab-com/gl-security/appsec