Correctly set cache headers for files served from object storage

Summary

Per https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/7942#note_221058899

When the UploadsController serves a file, it sets Cache-Control headers that it expects the client to see. However, when object storage is enabled, gitlab-workhorse uses its sendurl function rather than sendfile, so that it can stream the bytes from the object store. On that path, the client receives the Cache-Control headers sent by the object store rather than those set by Rails.

Steps to reproduce

Visit, e.g., https://gitlab.com/uploads/-/system/user/avatar/1149402/avatar.png?width=23 and note the Cache-Control header in the response.

What is the current bug behavior?

Rails returns a Cache-Control: public, max-age=... header, which is ignored. Google Cloud (the object store) returns a Cache-Control: private, max-age=0 header, which is what the client receives.

What is the expected correct behavior?

The Rails Cache-Control header should be sent instead.

Output of checks

This bug happens on GitLab.com

Possible fixes

In Workhorse, the offending line is https://gitlab.com/gitlab-org/gitlab-workhorse/blob/master/internal/sendurl/sendurl.go#L143 .

We could modify Workhorse to unconditionally merge certain headers from the Rails response into the response it sends back. This might be OK, but we would first need to audit all users of send-url in Rails to make sure it is safe. There might be cases where it's inappropriate.

If any of those exist, the simplest option is probably to add an override-headers setting of some kind to the send-data: payload we send to Workhorse. It would contain a list of header names that Workhorse should take from the Rails response instead of from the upstream response. This would require changes to both Rails and Workhorse.