[Proposal] Better Etag Support for Resources
Description
This issue is a proposal for implementing a conditional HTTP caching scheme with very low cost cache hits. This was originally discussed as part of https://gitlab.com/gitlab-org/gitlab-ce/issues/26396.
Using ETag Caching
Background
It is assumed that the reader is familiar with the following concepts.
Assumptions
1. ETag caching works best with RESTful resources: APIs which map resources using the standard RESTful pattern (GET /, GET /:id, POST /, etc.).
2. (Related to 1) ETag caching should ideally only be used with HTTP GET calls. Although the HTTP specification does allow caching to be implemented on HTTP POST methods, it is not widely used and can be confusing.
3. (Related to 1 & 2) ETag caching should only be used with read-only operations.
4. Any two HTTP endpoints with the same URL refer to the same resource. This solution will not work for HTTP endpoints where multiple resources are represented by the same URL. For example, the GitLab API's Get Current User endpoint, GET /user, would not be able to use the proposed solution, since the endpoint implies the authenticated user and therefore refers to multiple user resources under a single HTTP endpoint.
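To illustrate the last assumption, here is a minimal sketch assuming the cache key is derived from the URL path alone (the etag: prefix matches the key format used later in this proposal; the helper name and the sample users are invented for illustration):

```ruby
# Hypothetical helper: the cache key depends on the URL path only.
def etag_key(path)
  "etag:#{path}"
end

# Two different authenticated users polling the same URL (sample data).
alice_request = { user: "alice", path: "/user" }
bob_request   = { user: "bob",   path: "/user" }

# Both requests map to one key, so one user's ETag would be served to the other.
shared_key = etag_key(alice_request[:path]) == etag_key(bob_request[:path])

# URLs that identify a single resource each get their own key.
distinct_keys = etag_key("/users/1") != etag_key("/users/2")
```

With a path-only key there is no way to tell the two /user requests apart, which is why such endpoints are out of scope here.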
Why use Conditional HTTP Requests?
Currently, GitLab is using HTTP polling to check for modifications to resources. While an ideal solution to this problem would use some sort of push technology (WebSockets, HTTP long polling, Bayeux, etc.) to avoid polling altogether, an interim solution would be to make cache hits (i.e., an HTTP poll where the server-side resource has NOT changed) very cheap and thereby scalable.
At present, cache hits and cache misses incur the same performance and server resource penalties.
Built-in Rails Support for ETag Caching
Rack and Rails have moderately good built-in support for caching via the Rack::ETag and Rack::ConditionalGet middlewares.
More reading:
- Take Control of Your HTTP Caching in Rails - Jan 2015
- How key-based cache expiration works - DHH, Feb 2012
The problem with the built-in Rack ETag support is that it works by generating an MD5 checksum of the string body of the response. This means that in order to compare the If-None-Match request header to the current ETag, the full response needs to be generated, hashed and then compared to the current value.
Using the default Rack/Rails ETag support, a cache hit (a 304 Not Modified response) will:
- Take the same amount of time as a cache miss ❌
- Use the same amount of server resources as a cache miss ❌
- Use fewer network resources than a cache miss ✅
The default solution gives us 1 out of 3 advantages over no caching. An optimal solution would give us all three.
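The cost of body-hash ETags can be seen in a minimal sketch. This is a plain Ruby imitation of the approach, not the actual Rack middleware; render_pipelines and body_hash_conditional_get are hypothetical names, and the JSON body is sample data:

```ruby
require "digest"

# Counts how often the (pretend) expensive render runs.
RENDER_CALLS = { count: 0 }

# Hypothetical stand-in for an expensive controller action.
def render_pipelines
  RENDER_CALLS[:count] += 1
  '[{"id":10,"status":"success"}]'
end

# Body-hash conditional GET: the ETag only exists once the body is rendered,
# so the render happens on every request, hit or miss.
def body_hash_conditional_get(if_none_match)
  body = render_pipelines
  etag = %(W/"#{Digest::MD5.hexdigest(body)}")
  if if_none_match == etag
    [304, { "ETag" => etag }, ""]   # hit: body discarded after hashing
  else
    [200, { "ETag" => etag }, body] # miss: body sent
  end
end

first_status, first_headers, _first_body = body_hash_conditional_get(nil)
second_status, _headers, second_body = body_hash_conditional_get(first_headers["ETag"])
```

The second call returns a 304 with an empty body, yet RENDER_CALLS[:count] is 2: the hit saved bandwidth but none of the rendering work.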
How could we implement this?
First Iteration Proposal: A Rails Only Solution
As a first step, keeping the solution in a single package would allow us to move fast. Once we're confident that it works well, we could further improve it by utilising workhorse to check for cache hits, skipping Ruby altogether. But for now, let's focus on the first iteration.
For the rest of this proposal, I'll be using the GitLab Pipelines API in my examples. Refer to the documentation for more information on this API.
Until my Ruby, Rails and Grape knowledge improves, I'll use pseudocode rather than Ruby (sorry about this!). This solution could probably be more elegantly implemented using a mixin, middleware or some other mechanism. What I'm trying to focus on here is the process, not the implementation.
We'll use a class with several static methods:
ConditionalResources::is_none_match_valid_for_resource(request, response)
ConditionalResources::set_last_update_for_resource(request, response, last_updated)
ConditionalResources::invalidate(paths)
To add conditional caching to a route, we check whether the client has presented a valid If-None-Match header, and if so, return early with an HTTP 304 Not Modified.
if ConditionalResources::is_none_match_valid_for_resource(request, response) then
  # Cache Hit, return early
  render :nothing => true, :status => 304
  return
end
is_none_match_valid_for_resource does the following:
- If request does not include an If-None-Match header, return False immediately.
- If an If-None-Match header has been presented, get the resource path from the request, e.g. /projects/5/pipelines, and use this to look up the Redis string for this path, etag:/projects/5/pipelines.
- Iff the Redis key exists AND the value matches the header, return True.
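The steps above could be sketched in Ruby roughly as follows. This is a sketch under assumptions, not a finished implementation: a Hash stands in for Redis so the example runs without a server, the method takes the path and header value directly instead of request and response objects, and the stored ETag value is sample data:

```ruby
# In-memory stand-in for Redis (assumption), seeded with one sample ETag.
ETAG_STORE = { "etag:/projects/5/pipelines" => 'W/"abc123"' }

module ConditionalResources
  # Returns true only when the client's If-None-Match value equals the
  # ETag currently stored for the request path.
  def self.is_none_match_valid_for_resource(path, if_none_match, store = ETAG_STORE)
    # No header presented: not a conditional request, skip the lookup.
    return false if if_none_match.nil? || if_none_match.empty?

    # Single lookup; this would be a Redis GET in the proposal.
    stored = store["etag:#{path}"]
    !stored.nil? && stored == if_none_match
  end
end

hit = ConditionalResources.is_none_match_valid_for_resource("/projects/5/pipelines", 'W/"abc123"')
stale = ConditionalResources.is_none_match_valid_for_resource("/projects/5/pipelines", 'W/"old"')
unknown = ConditionalResources.is_none_match_valid_for_resource("/projects/9/pipelines", 'W/"abc123"')
no_header = ConditionalResources.is_none_match_valid_for_resource("/projects/5/pipelines", nil)
```

Only the first call is a cache hit. A stale value, an unknown path and a missing header all fall through to the normal route.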
On cache miss, the route proceeds as normal, but before returning, a call to set_last_update_for_resource must be made:
ConditionalResources::set_last_update_for_resource(request, response, last_updated)
set_last_update_for_resource works by:
- Generating an MD5 checksum using the provided last_updated date.
- Setting the response ETag header to W/#{md5}.
- Setting the Redis key (etag:/projects/5/pipelines) to the ETag header value. To prevent Redis from filling up with etag:* keys, a suitable TTL value would be used on the key.
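A rough Ruby sketch of these steps follows. Several details are assumptions that the proposal does not specify: FakeStore is an in-memory stand-in for Redis, the 20 minute TTL is an arbitrary placeholder, the timestamp is serialised as ISO 8601 before hashing, and the method takes a path and a headers Hash instead of request and response objects. The ETag is wrapped in double quotes because HTTP entity-tag syntax requires a quoted string.

```ruby
require "digest"
require "time"

# Minimal in-memory stand-in for Redis with per-key expiry (assumption).
class FakeStore
  def initialize
    @data = {}
  end

  # Mirrors Redis "SET key value EX ttl NX": write only if the key is absent.
  def set_if_absent(key, value, ttl:, now: Time.now)
    entry = @data[key]
    return false if entry && entry[:expires_at] > now

    @data[key] = { value: value, expires_at: now + ttl }
    true
  end

  def get(key, now: Time.now)
    entry = @data[key]
    entry && entry[:expires_at] > now ? entry[:value] : nil
  end
end

module ConditionalResources
  ETAG_TTL_SECONDS = 20 * 60 # hypothetical value, not taken from the proposal

  def self.set_last_update_for_resource(path, response_headers, last_updated, store)
    md5 = Digest::MD5.hexdigest(last_updated.getutc.iso8601(6))
    etag = %(W/"#{md5}")                 # weak ETag derived from the timestamp
    response_headers["ETag"] = etag      # sent back to the client
    store.set_if_absent("etag:#{path}", etag, ttl: ETAG_TTL_SECONDS)
    etag
  end
end

store = FakeStore.new
headers = {}
updated_at = Time.utc(2017, 1, 10, 12, 0, 0)
etag = ConditionalResources.set_last_update_for_resource(
  "/projects/5/pipelines", headers, updated_at, store
)
```

Because the ETag is derived from the last-updated timestamp rather than the response body, it can be compared on the next request without rendering anything.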
Finally, paths should be invalidated on model changes. This is where the invalidate method is used.
When a pipeline model is changed, it would use the after_save callback to invalidate any associated paths.
ConditionalResources::invalidate([
  "/projects/#{projectId}/pipelines",
  "/projects/#{projectId}/pipelines/#{pipelineId}"
])
invalidate works by deleting any associated etag keys in Redis. In this example, it would execute the following command in Redis:
DEL etag:/projects/5/pipelines etag:/projects/5/pipelines/10
The result of this is that any future calls to those endpoints would force a cache miss (since the ETag value has been removed from Redis).
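As a sketch of the invalidation step, assuming a plain Hash in place of Redis and an explicit store argument (both choices are for illustration only), invalidate could look like this:

```ruby
# In-memory stand-in for Redis (assumption), seeded with sample ETags.
store = {
  "etag:/projects/5/pipelines"    => 'W/"aaa"',
  "etag:/projects/5/pipelines/10" => 'W/"bbb"',
  "etag:/projects/6/pipelines"    => 'W/"ccc"'
}

module ConditionalResources
  # Deletes the ETag key for each path and returns how many keys were
  # actually removed, similar to the integer reply of Redis DEL.
  def self.invalidate(paths, store)
    keys = paths.map { |path| "etag:#{path}" }
    keys.count { |key| !store.delete(key).nil? }
  end
end

project_id = 5
pipeline_id = 10
removed = ConditionalResources.invalidate(
  ["/projects/#{project_id}/pipelines",
   "/projects/#{project_id}/pipelines/#{pipeline_id}"],
  store
)
```

Here removed is 2 and only the key for project 6 remains. In a Rails model this call would sit inside the after_save callback mentioned above.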
Whiteboard Sketches
Cache Miss
Cache Hit
Performance Costs
- A cache hit will cost one Redis call. In future this check could easily be migrated to workhorse for zero Ruby cost on cache hit.
- A cache miss will cost two additional Redis calls (one for the GET at the beginning and one for the SETNX at the end).
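These costs can be checked with a small counting sketch. CountingStore and poll are hypothetical names, the store is an in-memory stand-in for Redis, TTL handling is left out for brevity, and the timestamps are sample values:

```ruby
require "digest"

# Counting stand-in for Redis (assumption): records every command issued.
class CountingStore
  attr_reader :calls

  def initialize
    @data = {}
    @calls = []
  end

  def get(key)
    @calls << :get
    @data[key]
  end

  def setnx(key, value)
    @calls << :setnx
    return false if @data.key?(key)

    @data[key] = value
    true
  end

  def del(key)
    @calls << :del
    @data.delete(key)
  end
end

# One poll of a resource: returns the HTTP status and the response ETag.
def poll(store, path, if_none_match, last_updated)
  key = "etag:#{path}"
  return [304, if_none_match] if if_none_match && store.get(key) == if_none_match

  # Cache miss: the expensive render would happen here.
  etag = %(W/"#{Digest::MD5.hexdigest(last_updated)}")
  store.setnx(key, etag)
  [200, etag]
end

store = CountingStore.new
path = "/projects/5/pipelines"

_status, etag = poll(store, path, nil, "2017-01-10T12:00:00Z")
first_calls = store.calls.dup
store.calls.clear

hit_status, _etag = poll(store, path, etag, "2017-01-10T12:00:00Z")
hit_calls = store.calls.dup

store.del("etag:#{path}") # simulate invalidation after a model change
store.calls.clear

miss_status, new_etag = poll(store, path, etag, "2017-01-10T12:05:00Z")
miss_calls = store.calls.dup
```

The hit issues a single GET, and the miss after invalidation issues a GET followed by a SETNX. A first request without an If-None-Match header skips the GET and issues only the SETNX.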