[Research] Shard Redis caching by GitLab Group/Stage/Feature

This idea comes from discussion with @smcgivern. See https://docs.google.com/document/d/1woAADnZPR_8YNUUT5CVByvRNO4rgzO789Dcuu6LdsiE/edit for more details.

We appear to be nearing the limit of how much we can vertically scale a single Redis instance on GitLab.com.

There are many things we can do to optimise our Redis usage and delay reaching this limit, but we should probably start preparing a strategy now so that we are ready when we do reach it.

One approach, which @smcgivern has mentioned he favours, would be to shard our Redis usage by business area / DevOps stage (e.g. Plan, Create, etc.).

On a small GitLab instance, these would all point to a single Redis instance, but on GitLab.com, we could point them to different instances.
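As a rough illustration of the idea above, here is a minimal Python sketch of how a per-stage lookup with a shared fallback could work. Everything in it is an assumption for discussion purposes: the function name `resolve_redis_url`, the stage keys, and the example URLs are hypothetical and do not reflect any existing GitLab configuration.

```python
from typing import Dict, Optional

# Hypothetical default: the single shared Redis used by a small instance.
DEFAULT_REDIS_URL = "redis://localhost:6379/0"


def resolve_redis_url(
    stage: str,
    overrides: Optional[Dict[str, str]] = None,
    default: str = DEFAULT_REDIS_URL,
) -> str:
    """Return the Redis URL for a stage, falling back to the shared default.

    `overrides` maps a stage name (e.g. "plan") to a dedicated Redis URL.
    Stages without an override keep using the shared instance.
    """
    if not overrides:
        return default
    return overrides.get(stage, default)


# Small instance: no overrides, so every stage shares one Redis.
small_instance_overrides: Dict[str, str] = {}

# Large instance: selected stages point at their own (hypothetical) Redis hosts.
large_instance_overrides = {
    "plan": "redis://redis-plan.example.internal:6379/0",
    "create": "redis://redis-create.example.internal:6379/0",
}

plan_small = resolve_redis_url("plan", small_instance_overrides)
plan_large = resolve_redis_url("plan", large_instance_overrides)
verify_large = resolve_redis_url("verify", large_instance_overrides)
```

With this shape, a small instance needs no extra configuration, and a large one can move stages to dedicated instances one at a time, since any stage without an override (`verify` here) stays on the shared Redis.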

One of the advantages of this approach is that it gives each group more ownership of their Redis. Put bluntly, each group will be more directly responsible for the way they treat their Redis instance.

Another advantage is that regressions should be easier to trace back to the group that caused them, while still retaining the monolith architecture.

Edited Jul 19, 2019 by Liam McAndrew
