
Improve caching on raw endpoints (Cloudflare rules)

As part of https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2797 we noticed that caching for the raw endpoint (and most likely also for the archive endpoint) is implemented inefficiently.

There are a few main points:


Optimizing our Cloudflare rules for caching

To make this work, we need to change the Cloudflare caching rules.

In the rules we need to set:

Disable Security # Ensures a client does not receive a CAPTCHA on this endpoint. Firewall rules are evaluated before this setting
Response Buffering: Off # Streams the response to the client, which reduces the time to first byte
Cache Level: Cache Everything # Allows every response to be cached
Query String Sort: On # Ensures responses to requests with query strings are cached regardless of the order of the query parameters
Respect Strong ETags: On # Without this, we cannot cache range requests
Cache Deception Armor: Off # Required because we receive requests with a Cookie. The app decides whether a resource may be cached, so this is fine
Origin Cache-Control: On # Enforces the cache settings we specify in the `cache-control` header
Disable Apps # Disables apps installed in Cloudflare
Disable Performance # Disables content transformation (but should still allow re-compression to a gzip/brotli response, for example)
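
To illustrate why Query String Sort helps the cache hit rate, here is a minimal Python sketch of the idea. It is not Cloudflare's implementation; `normalize_cache_key` is a hypothetical helper and the URLs are made-up examples. It only shows that two requests differing in parameter order can map to a single cache key once the parameters are sorted.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_cache_key(url: str) -> str:
    """Return the URL with its query parameters sorted (hypothetical helper)."""
    parts = urlsplit(url)
    # Sort the (name, value) pairs so parameter order no longer matters.
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    # Drop the fragment: it is never sent to the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(pairs), ""))


# Example URLs are assumptions made for illustration only.
first = normalize_cache_key("https://example.com/group/project/-/raw/main/README.md?b=2&a=1")
second = normalize_cache_key("https://example.com/group/project/-/raw/main/README.md?a=1&b=2")
assert first == second
```

Without sorting, the two requests would be stored as separate cache entries even though the origin returns the same response for both.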

Cloudflare states:

Strong ETag headers ensure the resource in browser cache and on the web server are byte-for-byte identical. Domains on Enterprise plans enable strong ETag headers via a Respect Strong ETags Page Rule. Otherwise, strong ETag headers are converted to weak ETag headers. Also, set a strong ETag header in quotes (Etag: "example") or Cloudflare removes the ETag instead of converting it to a weak ETag.
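
The distinction Cloudflare describes can be checked on the origin side before enabling the rule. The following sketch classifies an `ETag` header value as strong, weak, or invalid (unquoted). `classify_etag` is a hypothetical helper, and the patterns are a simplification of the HTTP entity-tag syntax, not a full validator.

```python
import re

# Simplified patterns: a strong ETag is a quoted string, a weak one is
# the same with a W/ prefix. Anything else is treated as invalid here.
_STRONG = re.compile(r'^"[^"]*"$')
_WEAK = re.compile(r'^W/"[^"]*"$')


def classify_etag(value: str) -> str:
    """Classify an ETag header value as 'strong', 'weak', or 'invalid'."""
    value = value.strip()
    if _WEAK.match(value):
        return "weak"
    if _STRONG.match(value):
        return "strong"
    return "invalid"


assert classify_etag('"example"') == "strong"
assert classify_etag('W/"example"') == "weak"
assert classify_etag("example") == "invalid"  # unquoted: Cloudflare would remove it
```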

Note! These changes must ONLY be made after the headers are set correctly in the code, to prevent accidental caching.
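
As a rough pre-flight check, the response headers of the affected endpoints could be inspected before the rule is enabled. What counts as "set correctly" is decided by the application and is not spelled out here, so the sketch below rests on stated assumptions: a response without a `cache-control` header is treated as not ready, `private` and `no-store` mark responses a shared cache must not store, and an explicit `public`, `s-maxage`, or `max-age` marks a response as intended for caching. `shared_cache_may_store` is a hypothetical helper.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header value into lowercase directives."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_may_store(headers: Dict[str, str]) -> bool:
    """Return True if the response is explicitly marked as cacheable (assumed policy)."""
    value = next((v for k, v in headers.items() if k.lower() == "cache-control"), None)
    if value is None:
        # Assumption: a missing header means the code is not ready yet.
        return False
    directives = parse_cache_control(value)
    if "private" in directives or "no-store" in directives:
        return False
    return any(name in directives for name in ("public", "s-maxage", "max-age"))


assert shared_cache_may_store({"Cache-Control": "public, max-age=60"})
assert not shared_cache_may_store({"cache-control": "private, max-age=60"})
assert not shared_cache_may_store({})
```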

Edited by Hendrik Meyer (xLabber)