Etag for project exports is always the same, corrupting caches

Summary

When downloading a project export from https://gitlab.com/api/v4/projects/:id/export/download, the server always returns the same Etag value. When this value is sent on later requests with If-None-Match, the server responds with 304 Not Modified although the file has been modified. As far as I understand HTTP caching, this is wrong behavior, and it can lead to a client using wrong (outdated) data as the project export.

Steps to reproduce

  1. Generate export for a project
  2. Download export file and look at header: curl -D headers --header "PRIVATE-TOKEN: :token" https://gitlab.com/api/v4/projects/:id/export/download > /dev/null; cat headers
  3. Change something in the project
  4. Generate new export for the project
  5. Download export file if changed, using Etag value from previous request: curl -D headers --header 'If-None-Match: :etag' --header "PRIVATE-TOKEN: :token" https://gitlab.com/api/v4/projects/:id/export/download > /dev/null; cat headers
  6. Alternatively, download export file and compare Etag header with previous download: curl -D headers --header "PRIVATE-TOKEN: :token" https://gitlab.com/api/v4/projects/:id/export/download > /dev/null; cat headers
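
The header comparison from steps 2 and 6 can also be scripted. This is only a minimal sketch in Python, assuming each header dump written by `curl -D` contains a single response (no redirects) and has been read into a string. The function names `parse_headers` and `etag_reused_for_changed_export` are made up for illustration.

```python
def parse_headers(dump: str) -> dict:
    """Parse a header dump as written by `curl -D` into a dict.

    Header names are lower-cased; the status line is skipped.
    Assumes the dump holds exactly one response.
    """
    headers = {}
    for line in dump.splitlines()[1:]:  # first line is the status line
        if not line.strip():
            break  # blank line marks the end of the header block
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def etag_reused_for_changed_export(first_dump: str, second_dump: str) -> bool:
    """Return True if both responses carry the same Etag although
    Content-Length or Last-Modified indicate a different file."""
    first = parse_headers(first_dump)
    second = parse_headers(second_dump)
    same_etag = first.get("etag") is not None and first.get("etag") == second.get("etag")
    changed = (
        first.get("content-length") != second.get("content-length")
        or first.get("last-modified") != second.get("last-modified")
    )
    return same_etag and changed
```

Calling `etag_reused_for_changed_export` with the contents of the two `headers` files from steps 2 and 6 should return `True` when the bug is present.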

Example Project

Occurs with all of my projects on GitLab.com.

What is the current bug behavior?

The request with If-None-Match will not download the project export (although it has changed). When downloading the file again and comparing the headers, the Etag values will be the same. In fact, I see the same constant Etag values for exports across all of my projects (and over a time period of several weeks), though it seems the value depends on the downloading user.

What is the expected correct behavior?

The server should not use the same Etag value for semantically different files downloaded from the same URL. Consequently, a request with If-None-Match and an old Etag should still download the export if it has changed.
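
To illustrate the behavior I would expect, here is a rough sketch in Python. It is not how GitLab implements this. I am assuming the Etag is derived from a hash of the export file (a modification timestamp would serve as well), and the names `content_etag` and `is_not_modified` are hypothetical. The comparison ignores the `W/` prefix, as the weak comparison for If-None-Match does, and splits the header on commas, which is a simplification that only holds for Etags without commas.

```python
import hashlib


def content_etag(body: bytes) -> str:
    """Return a quoted Etag derived from the file content (assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def _opaque(tag: str) -> str:
    """Strip surrounding whitespace and an optional weak prefix (W/)."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(if_none_match: str, current_etag: str) -> bool:
    """Decide whether a 304 is appropriate for the given If-None-Match value.

    Returns True only if one of the client's Etags matches the Etag of the
    file that would be served now.
    """
    if if_none_match.strip() == "*":
        return True
    candidates = [_opaque(t) for t in if_none_match.split(",")]
    return _opaque(current_etag) in candidates
```

With such a scheme, a regenerated export with different content gets a different Etag, so a request carrying the old value is answered with 200 and the new file instead of 304.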

Relevant logs and/or screenshots

Headers from first download of project
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 10 Jul 2018 05:09:58 GMT
Content-Type: application/gzip
Content-Length: 115153
Accept-Ranges: bytes
Cache-Control: max-age=0, private, must-revalidate
Content-Disposition: attachment; filename=2018-07-09_22-02-117_<PROJECT>_export.tar.gz
Content-Transfer-Encoding: binary
Etag: W/"37a6259cc0c1dae<TRUNCATED>"
Last-Modified: Mon, 09 Jul 2018 22:02:10 GMT
Vary: Origin
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Request-Id: 0210e059-f22a-465e-8620-5dd643f48ddb
X-Runtime: 0.069301
Strict-Transport-Security: max-age=31536000
RateLimit-Limit: 600
RateLimit-Observed: 3
RateLimit-Remaining: 597
RateLimit-Reset: 1531199458
RateLimit-ResetTime: Wed, 10 Jul 2018 05:10:58 GMT

I truncated the Etag because, as mentioned above, it stays constant for a long time but differs for each user, and I want to make sure it does not leak confidential information.

Headers from requests with `If-None-Match: W/"37a6259cc0c1dae"`
HTTP/1.1 304 Not Modified
Server: nginx
Date: Tue, 10 Jul 2018 05:28:40 GMT
Cache-Control: max-age=0, private, must-revalidate
Content-Disposition: attachment; filename=2018-07-10_05-21-484_<PROJECT>_export.tar.gz
Content-Transfer-Encoding: binary
Etag: W/"37a6259cc0c1dae<TRUNCATED>"
Vary: Origin
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Request-Id: 38956c09-9131-4c1f-93d2-b25fb4050e28
X-Runtime: 0.062416
X-Sendfile: /var/opt/gitlab/gitlab-rails/shared/tmp/project_exports/<PROJECT>/2018-07-10_05-21-484_<PROJECT>_export.tar.gz
Strict-Transport-Security: max-age=31536000
RateLimit-Limit: 600
RateLimit-Observed: 2
RateLimit-Remaining: 598
RateLimit-Reset: 1531200580
RateLimit-ResetTime: Wed, 10 Jul 2018 05:29:40 GMT

Note that this response also leaks the X-Sendfile header.

Headers from second download of project
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 10 Jul 2018 05:21:11 GMT
Content-Type: application/gzip
Content-Length: 115736
Accept-Ranges: bytes
Cache-Control: max-age=0, private, must-revalidate
Content-Disposition: attachment; filename=2018-07-10_05-21-484_<PROJECT>_export.tar.gz
Content-Transfer-Encoding: binary
Etag: W/"37a6259cc0c1dae<TRUNCATED>"
Last-Modified: Tue, 10 Jul 2018 05:21:06 GMT
Vary: Origin
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Request-Id: 0287e092-2f16-4772-95c2-12bff3b98b4f
X-Runtime: 0.043006
Strict-Transport-Security: max-age=31536000
RateLimit-Limit: 600
RateLimit-Observed: 1
RateLimit-Remaining: 599
RateLimit-Reset: 1531200131
RateLimit-ResetTime: Wed, 10 Jul 2018 05:22:11 GMT

Note that even the Content-Length differs, but the Etag is the same.

Output of checks

This bug happens on GitLab.com
