Set all attachments to Content-Type application/octet-stream

What does this MR do and why?

In GitLab 11.6, via gitlab-workhorse!335 (merged), Workhorse attempts to detect a blob's Content-Type by proxying the download and examining the first 512 bytes. This was done to thwart a security issue where certain types of files could trick the browser into displaying a file inline and executing its contents (https://gitlab.com/gitlab-org/gitlab-foss/-/issues/36103).

However, the detection mechanism relies on Go's http.DetectContentType([]byte) function, which implements the algorithm described in https://mimesniff.spec.whatwg.org/. This algorithm can only detect a small number of types, so files can be labeled with the wrong Content-Type.

For example, this Workhorse change caused Microsoft Word .docx files to have a Content-Type of application/zip instead of application/vnd.openxmlformats-officedocument.wordprocessingml.document. A similar problem exists for other documents.
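The mislabeling can be reproduced outside Workhorse. The following is a minimal sketch, not the Workhorse code itself: the `sniff` helper and the sample bytes are made up for illustration, and only `http.DetectContentType` comes from the description above. A .docx file is a ZIP archive, so its leading bytes carry the ZIP signature and nothing that identifies it as a Word document.

```go
package main

import (
	"fmt"
	"net/http"
)

// sniff is a hypothetical helper that returns the Content-Type Go guesses
// from the leading bytes of a file. http.DetectContentType considers at
// most the first 512 bytes.
func sniff(head []byte) string {
	return http.DetectContentType(head)
}

func main() {
	// A .docx file is a ZIP container, so it starts with the ZIP local
	// file header signature "PK\x03\x04". These bytes stand in for the
	// beginning of a real upload.
	docxHead := []byte("PK\x03\x04\x14\x00\x06\x00")

	fmt.Println(sniff(docxHead)) // application/zip
}
```

Because the sniffer only sees the container signature, any ZIP-based format would be reported the same way.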

It's natural to assume that we have to fix the Content-Type detection, but there's a simpler way that preserves both security and accuracy: serve every attachment as application/octet-stream and let the browser deal with it. For example, when a .docx file is downloaded with a Content-Type of application/octet-stream, it seems that Firefox and Chrome use the Content-Disposition, URL, and Content-Type as hints about what to do with the file. Note that the browser prompt that shows this guess is an advanced option that has to be enabled in the browser.

Relates to #26448 (closed)

How to set up and validate locally

  1. Open a project and upload a .docx file.
  2. Change to this branch.
  3. Set your browser's setting to ask what to do with attachments. In Firefox:

image

  4. Download the .docx file. The browser should show a guessed Content-Type:

image

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Stan Hu
