Capacity Planning: redis Service, redis_primary_cpu resource (#1572)

CPU on our primary Redis instance is trending up quite steeply.

![image](/uploads/75d647e0be504fca08115f6912d75951/image.png)

From the Tamland report at https://gitlab-com.gitlab.io/gl-infra/tamland/saturation.html

---

This is a placeholder for now, but we need to start thinking about next steps for scaling the Redis service (along with Redis-Sidekiq, which has its own issue: https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/590).

Possible actions include:

1. Watch and wait to see if things flatten out.
1. Investigate the sources of traffic and optimize the application.
1. Optimize the Redis instance itself, including upgrading to Redis 6 with `--io-threads`.
1. Scale vertically, if GCP offers more powerful instance types. (~~Afaik, I don't think they do have any more powerful Intel machines, and we can't experiment with things like AMD EPYC processors yet, as they're in Beta and don't run in `us-east1` yet~~ Should we investigate N2D AMD EPYC nodes?)
1. Break off more bits from the Redis instance.
    1. Rack sessions seem like a possible option, and moving them would also unblock us from Redis Cluster.
1. Address any Redis Cluster key violations in preparation for Redis Cluster.
1. Start preparing for Redis Cluster.
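To make the "Redis Cluster key violations" item more concrete: in Redis Cluster, a multi-key command only succeeds if all of its keys hash to the same slot. The slot is CRC16 (XMODEM variant) of the key modulo 16384, and if the key contains a non-empty `{...}` hash tag, only the tag is hashed. The following is a minimal Python sketch of that rule, not a description of how the application or any existing tooling detects violations. The function names and the example keys are hypothetical.

```python
import binascii

SLOTS = 16384  # number of hash slots in Redis Cluster


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that Redis Cluster hashes.

    If the key has a '{' followed later by a '}' with at least one
    byte in between, only that substring is hashed.
    """
    start = key.find(b"{")
    if start == -1:
        return key
    end = key.find(b"}", start + 1)
    if end == -1 or end == start + 1:
        # No closing brace, or an empty tag "{}": hash the whole key.
        return key
    return key[start + 1:end]


def key_slot(key) -> int:
    """Compute the cluster hash slot for a key (str or bytes)."""
    if isinstance(key, str):
        key = key.encode()
    # binascii.crc_hqx with an initial value of 0 is CRC16/XMODEM.
    return binascii.crc_hqx(hash_tag(key), 0) % SLOTS


def is_cross_slot(keys) -> bool:
    """True if a multi-key command over `keys` would span slots."""
    return len({key_slot(k) for k in keys}) > 1


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    print(is_cross_slot(["session:1", "session:2"]))
    print(is_cross_slot(["{session}:1", "{session}:2"]))
```

A check like this could be run over sampled multi-key commands to estimate how many call sites would need hash tags (or restructuring) before a move to Redis Cluster.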
