Correctly set cache headers for files served from object storage
Summary
Per https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/7942#note_221058899
When the UploadsController serves a file, it sets Cache-Control headers that it expects the client to see. However, when object storage is enabled, we use the sendurl, rather than sendfile, function of gitlab-workhorse (to stream the bytes from the object store). This uses the Cache-Control headers sent by the object store, rather than those sent by Rails.
Steps to reproduce
Visit, e.g., https://gitlab.com/uploads/-/system/user/avatar/1149402/avatar.png?width=23 and note the Cache-Control header in the response.
What is the current bug behavior?
Rails returns a public, max-age=... header, which is ignored. Google Cloud (the object store) returns a private, max-age=0 header, which is sent to the client.
What is the expected correct behavior?
The Rails Cache-Control header should be sent instead.
Output of checks
This bug happens on GitLab.com
Possible fixes
In Workhorse, the offending line is https://gitlab.com/gitlab-org/gitlab-workhorse/blob/master/internal/sendurl/sendurl.go#L143 .
We could modify Workhorse to unconditionally merge certain headers from Rails into the response it sends back. This might be OK, but we need to audit all users of send-url in Rails and confirm that it is safe for each of them. There might be cases where it's inappropriate.
If any of those exist, the simplest option is probably going to be to add an override-headers setting of some kind into the send-data: payload we send to Workhorse. This can contain a list of header names that Workhorse should take from the Rails response instead of from the upstream response. This would require changes to both Rails and Workhorse.