Migrate etag caching to redis-cluster-cache
The etag cache takes up a sizeable amount of IO on redis persistent, as highlighted in #2155 (comment 1470006052). The keys have a 20-minute TTL, so they might be compatible with a cache-based Redis.
Could it be migrated into redis-cluster-cache to help buy more headroom for https://gitlab.com/gitlab-com/gl-infra/capacity-planning/-/issues/1061?
Considerations:
- The original idea was to terminate conditional requests in workhorse instead of rails. But moving it to redis-cluster-cache will delay that effort until gitlab-org/gitlab#413151 (closed) is done.
- Assuming we wish to execute on point (1), it is a potentially breaking change for SM users since the workhorse now needs to connect to 2 different Redises (shared state for pubsub and cache for etags). At the very least, rollout will not be trivial (perhaps a window of etag cache misses when switching between stores).
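To make the "window of etag cache misses" concrete, here is a minimal Python sketch, not the actual Rails or workhorse code. `FakeRedis`, `EtagStore`, `use_new` and `read_fallback` are hypothetical names invented for illustration; the only value taken from this issue is the 20-minute TTL. It assumes a hard switch would leave the new store cold, and that a temporary read fallback to the old store could cover that gap.

```python
import time

ETAG_TTL_SECONDS = 20 * 60  # 20-minute TTL mentioned in this issue


class FakeRedis:
    """In-memory stand-in for a Redis store with per-key expiry (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired keys behave like missing keys.
            del self._data[key]
            return None
        return value


class EtagStore:
    """Routes etag reads and writes to the old or new store (hypothetical)."""

    def __init__(self, old, new, use_new=False, read_fallback=False):
        self.old = old                      # e.g. the shared state store
        self.new = new                      # e.g. the cache store
        self.use_new = use_new              # stand-in for a feature flag
        self.read_fallback = read_fallback  # read the old store on a miss

    def touch(self, key, etag):
        # Writes always go to exactly one store: whichever is active.
        target = self.new if self.use_new else self.old
        target.set(key, etag, ETAG_TTL_SECONDS)

    def get(self, key):
        if not self.use_new:
            return self.old.get(key)
        value = self.new.get(key)
        if value is None and self.read_fallback:
            # Only reached while the new store is still cold for this key.
            value = self.old.get(key)
        return value
```

Under these assumptions, a key written before the switch misses once `use_new` is enabled, unless `read_fallback` is on. Because every key expires after 20 minutes, the fallback (and the second connection it requires) would only be needed for roughly one TTL after the switch.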
Outcome
We will migrate the etag cache out of SharedState to Cache to take advantage of the larger headroom in redis-cluster-cache. We will not let consideration (2) overly hold us back since that is not something that is on the roadmap in the near term. Furthermore, etag can be cached at other levels like CDN in the future.
For now, this gives us an impactful quick win in saturation prevention and moves the workload to a datastore that is more appropriate for its ephemeral keys.