
gitlab-pages returns empty Content-Encoding for certain artifacts, which confuses Chromium-based browsers

gitlab-pages adds an empty Content-Encoding header to proxied artifacts when the upstream server doesn't return compressed data (and thus no Content-Encoding header is present in the upstream's response).

If the nginx in front of gitlab-pages then decides to compress the response, it appends another Content-Encoding: gzip header, leading to two conflicting headers in the response returned to the web browser. Chromium-based browsers are confused by this: they trust the first, empty Content-Encoding header and deliver the gzip-compressed data to the user as-is.

This is a regression in commit 0ba2f08c, present in GitLab 18.4.0 or newer.

Detailed info

When Pages is configured on a GitLab instance, gitlab-pages is used to serve certain artifact types, such as HTML or plain text files, acting as a proxy:

web browser <-(1)-> nginx <-(2)-> gitlab-pages <-(3)-> nginx <-(4)-> rails/workhorse

I numbered the individual request stages/steps, as I'll be referencing them below. The above applies at least to Omnibus-based installations, which is what we're running.

Since commit 0ba2f08c (part of GitLab 18.4.0 or newer), gitlab-pages started proxying the Content-Encoding header from upstream. The incoming Content-Encoding header would be gzip if nginx at stage (3) decided to compress the response, otherwise it would be absent. The default config for Rails/Workhorse (gitlab-http.conf) is as follows:

gzip on;
gzip_static on;
gzip_comp_level 2;
gzip_http_version 1.1;
gzip_vary on;
gzip_disable "msie6";
gzip_min_length 250;
gzip_proxied no-cache no-store private expired auth;
gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript application/json;

Most importantly, this defines specific MIME types that are eligible for compression, and the minimum compressible length of 250 bytes.

When the upstream response isn't compressed, there is no Content-Encoding header present in it. In this case gitlab-pages artificially adds an empty Content-Encoding header (below is the request/response at stage (2)):

GET /-/artifacts/-/jobs/66/artifacts/out/index.html HTTP/1.1
Host: demo.gitlab.pages
X-Real-IP: 172.28.0.1
X-Forwarded-For: 172.28.0.1
X-Forwarded-Proto: http
Connection: close
User-Agent: curl/8.16.0
Accept: */*
Accept-Encoding: deflate, gzip, br, zstd

HTTP/1.1 200 OK
Cache-Control: max-age=3600
Content-Encoding: 
Content-Length: 160
Content-Type: text/html; charset=utf-8
Vary: Origin
X-Request-Id: 01K7EMGSYTAP5RKB9JPVXYSVVM
Date: Mon, 13 Oct 2025 11:15:34 GMT
Connection: close

This is incorrect behaviour; however, on its own it may not produce any noticeable negative effects.

However, if the client-facing nginx at stage (1) then decides to compress the response towards the web browser, it appends another Content-Encoding header at the end:

GET /-/artifacts/-/jobs/66/artifacts/out/index.html HTTP/1.1
Host: demo.gitlab.pages
User-Agent: curl/8.16.0
Accept: */*
Accept-Encoding: deflate, gzip, br, zstd

HTTP/1.1 200 OK
Server: nginx
Date: Mon, 13 Oct 2025 11:15:34 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Cache-Control: max-age=3600
Content-Encoding: 
Vary: Origin
X-Request-Id: 01K7EMGSYTAP5RKB9JPVXYSVVM
Strict-Transport-Security: max-age=63072000  
Content-Encoding: gzip

The gitlab-pages.conf nginx configuration doesn't define any gzip options of its own, so it inherits the default global gzip configuration from nginx.conf:

gzip on;
gzip_http_version 1.1;
gzip_comp_level 2;
gzip_proxied no-cache no-store private expired auth;
gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript application/json;

Note that the global configuration doesn't enforce any minimum content length for compression, so all responses with the above MIME types will be compressed. In particular, responses under 250 bytes (with a matching MIME type) are still compressed at stage (1), even though they were too short to be compressed at stage (3).

In the example above, I created an index.html artifact that is 160 bytes long. It doesn't get compressed at stage (3), but does get compressed at stage (1).

Having two Content-Encoding headers results in undefined behaviour. In particular, Chromium-based browsers seem to trust the first header and return the gzip-compressed data to the user, while Firefox and curl trust the last header:

Screenshot of HTML artifact preview in Edge, Firefox and curl

Microsoft Edge at the top, Firefox in the middle, curl at the bottom.
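
As a side note, Go's own http.Header illustrates the ambiguity: its Get method returns only the first value of a repeated header, so any client with "first wins" semantics sees the empty encoding, while a "last wins" client sees gzip. A minimal illustrative sketch (not gitlab-pages code):

package main

import (
	"fmt"
	"net/http"
)

func main() {
	// Reconstruct the duplicated header from the response above.
	h := http.Header{}
	h.Add("Content-Encoding", "")
	h.Add("Content-Encoding", "gzip")

	fmt.Printf("first value (Header.Get): %q\n", h.Get("Content-Encoding")) // ""
	fmt.Printf("all values: %q\n", h["Content-Encoding"])                   // ["" "gzip"]
}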

Steps to reproduce

  1. Install GitLab 18.4.2 either as a Debian package or use the official Docker image.

  2. Configure a GitLab Runner capable of running container images, e.g. of type docker.

  3. Configure GitLab Pages, e.g. for domain gitlab.pages.

  4. Create a project with the following .gitlab-ci.yml contents:

    Demo:
      stage: build
      image: alpine:latest
      script:
        - mkdir out; cd out
    
        # 45 bytes (44 chars + newline)
        - echo 'The quick brown fox jumps over the lazy dog.' > short.txt
    
        # 270 bytes (269 chars + newline)
        - echo 'The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.' > long.txt
    
        # 160 bytes (159 chars + newline)
        - echo '<!doctype html><html><head><meta charset="utf-8"/><title>Foo</title></head><body><h1>Foo!</h1><p>The quick brown fox jumps over the lazy dog.</p></body></html>' > index.html
    
        # 295 bytes (294 chars + newline)
        - echo '<!doctype html><html><head><meta charset="utf-8"/><title>Foo</title></head><body><h1>Foo!</h1><p>The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.</p></body></html>' > index_long.html
      artifacts:
        paths:
          - out
  5. Open the generated out/index.html artifact in a Chromium-based web browser, such as Google Chrome or Microsoft Edge.

  6. View the HTTP response headers for the above request, either using the browser's developer tools, or by making a separate request with curl (something like curl --compressed -i http://demo.gitlab.pages/-/artifacts/-/jobs/65/artifacts/out/index.html).

Expected results

When opened in a Chromium-based web browser, a valid HTML document is presented.

When viewing the HTTP response headers, only one Content-Encoding header is present, and that header has a non-empty value.

Actual results

When opened in a Chromium-based web browser, the contents of the page are gibberish, rather than the expected HTML document.

When viewing the HTTP response headers, there are two Content-Encoding headers: one empty, and one gzip.

Culprit and suggested fix

I pinpointed the incorrect behaviour to the following line in internal/artifact/artifact.go:

w.Header().Set("Content-Encoding", resp.Header.Get("Content-Encoding"))

This unconditionally sets the Content-Encoding header regardless of whether it's present in resp or not. Header.Get returns an empty string when the specified header is absent (see the Go net/http documentation).
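
To illustrate the effect, here is a minimal, self-contained sketch (not gitlab-pages code) showing that copying an absent header this way makes Go's HTTP server emit a literal empty Content-Encoding line on the wire:

package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

func main() {
	// Mimic the buggy proxy behaviour: blindly copy a header the upstream
	// response never contained.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		upstream := http.Header{} // upstream sent no Content-Encoding at all
		// Header.Get returns "" for the absent header, and Set stores
		// that empty value as-is.
		w.Header().Set("Content-Encoding", upstream.Get("Content-Encoding"))
		fmt.Fprint(w, "<h1>hello</h1>")
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Prints [""]: the response carried a literal "Content-Encoding: " line,
	// which a downstream nginx is then free to complement with its own
	// "Content-Encoding: gzip".
	fmt.Printf("Content-Encoding: %q\n", resp.Header["Content-Encoding"])
}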

The simplest fix is to only set the header if its value is non-empty:

diff --git a/internal/artifact/artifact.go b/internal/artifact/artifact.go
index 86ea681fa2e0..57fed5f63c62 100644
--- a/internal/artifact/artifact.go
+++ b/internal/artifact/artifact.go
@@ -137,7 +137,11 @@ func (a *Artifact) makeRequest(w http.ResponseWriter, r *http.Request, reqURL *u
    }
 
    w.Header().Set("Content-Type", resp.Header.Get("Content-Type"))
-   w.Header().Set("Content-Encoding", resp.Header.Get("Content-Encoding"))
+
+   ce := resp.Header.Get("Content-Encoding")
+   if ce != "" {
+       w.Header().Set("Content-Encoding", ce)
+   }
 
    // If the API uses chunked encoding, it will omit the Content-Length Header.
    // Go uses a value of -1, which is an invalid value.

Note that the test suite needs to be fixed as well to check more strictly for absent vs. empty headers. Also, I see some more unconditional Header().Set(...) calls; those likely need to be refactored too (e.g. the Content-Type set operation right above).
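
For example, a stricter check could assert on the raw header map instead of going through Header.Get (a hypothetical helper, not taken from the existing test suite):

package artifact_test

import (
	"net/http"
	"testing"
)

// assertHeaderAbsent fails the test if the header key is present at all,
// even with an empty value; Header.Get alone cannot tell the two cases apart.
func assertHeaderAbsent(t *testing.T, h http.Header, key string) {
	t.Helper()
	if values, ok := h[http.CanonicalHeaderKey(key)]; ok {
		t.Fatalf("expected header %q to be absent, got %q", key, values)
	}
}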

Note on MIME types

We originally encountered this bug with short plain text files on our production system based on the GitLab Debian package. When I tried to reproduce the same bug locally on a GitLab Docker image, the text file artifact would return with application/octet-stream MIME type instead of the expected text/plain.

In the chain above, Workhorse is the one responsible for determining the MIME type of the returned artifact. It uses the Go mime standard library package, which comes with its own small MIME database.

In Go 1.24.5 in particular (used to build GitLab 18.4.2), this database is very small and doesn't include an entry for text/plain files. However, Go's mime package can also consult system-level MIME databases, if they are present.
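
To see what a given environment resolves, one can query the package directly (a standalone sketch, not Workhorse code; the exact output depends on which MIME files, such as /etc/mime.types, are installed):

package main

import (
	"fmt"
	"mime"
)

func main() {
	// Without a system MIME database, Go 1.24's built-in table has no entry
	// for .txt, so this prints "" and callers typically fall back to
	// application/octet-stream. With /etc/mime.types installed, it prints
	// "text/plain; charset=utf-8" instead.
	fmt.Printf(".txt  -> %q\n", mime.TypeByExtension(".txt"))
	fmt.Printf(".html -> %q\n", mime.TypeByExtension(".html"))
}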

As it turned out, we have two matching MIME databases on our production server, while the official GitLab Docker image has none. This explains the discrepancy in the returned MIME types.

A future Go version will include a more comprehensive MIME database out of the box. However, it could still be worthwhile to include a system-wide MIME database in the GitLab Docker image to make it behave more like a non-containerised installation.

A short note on root causes

  1. Neither gitlab-pages itself nor the test suite differentiates between an empty header and an absent header. This distinction must be made in the code and covered by tests.
  2. MIME type handling occurs in multiple places and may behave differently depending on the environment: nginx has clauses that depend on MIME types, and upstream servers may return different MIME types for the same file depending on which MIME databases are available.

Environment

Tested on GitLab EE 18.4.2 with Debian/Ubuntu package as well as the official Docker image.

Microsoft Edge 141.0.3537.71 was used as the web browser. A colleague confirmed the same behaviour in Chrome, but I was given no version information.