Capacity Planning: redis-sidekiq service, redis_primary_cpu is trending towards saturation around April 2021
Disclaimer: this forecast is based on a relatively small sample of data, but I'm opening this issue early as a starting point. A due date has been set on this issue as a reminder to recheck the forecast once another month of data is available.
On GitLab.com, Sidekiq relies on a dedicated Redis instance for message queuing and other purposes. This Redis instance is limited to a single core.
We monitor utilization of this resource (named redis_primary_cpu in our monitoring infrastructure): https://dashboards.gitlab.net/d/alerts-saturation_component/alerts-saturation-component-alert?orgId=1&var-environment=gprd&var-type=redis-sidekiq&var-stage=main&var-component=redis_primary_cpu&from=now-45d&to=now
Our monitoring shows that the 95th percentile value for this resource has grown from 48% to 65% over the past 45 days. The 99th percentile has grown from 53% to 71% in the same period.
This growth is mirrored in Tamland's forecasts: https://gitlab-com.gitlab.io/gl-infra/tamland/saturation.html, which may even be a bit optimistic.
Extended forward, the Tamland forecast predicts that, at current growth trends, we'll saturate CPU on Redis Sidekiq by April 2021.
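As a rough sanity check on the numbers above, the growth figures can be extrapolated with a straight line. This is only a back-of-the-envelope sketch and not Tamland's model: it assumes growth stays linear at the rate seen over the 45-day window, and the function name and the 100% limit are illustrative choices.

```python
def days_until_saturation(start_pct, end_pct, window_days, limit_pct=100.0):
    """Days after the end of the window until a straight-line trend hits limit_pct.

    Assumes utilization keeps growing at the average rate observed
    between start_pct and end_pct over window_days.
    """
    slope = (end_pct - start_pct) / window_days  # percentage points per day
    if slope <= 0:
        return None  # flat or shrinking: a linear trend never reaches the limit
    return (limit_pct - end_pct) / slope


# Figures quoted in this issue: 45-day window, p95 48% -> 65%, p99 53% -> 71%.
p95_days = days_until_saturation(48.0, 65.0, 45)
p99_days = days_until_saturation(53.0, 71.0, 45)

print(f"p95 reaches 100% in about {p95_days:.1f} days")
print(f"p99 reaches 100% in about {p99_days:.1f} days")
```

Under these assumptions the 99th percentile reaches 100% in roughly two and a half months and the 95th percentile in roughly three, which is in the same range as the Tamland forecast. In practice we would expect problems well before utilization reaches 100%.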
Scaling Options
We may be able to gain a bit of additional headroom by upgrading to Redis 6 and utilizing the io-threads functionality it provides. It's possible that a new, more powerful machine type will become available on GCP before this date, which would let us squeeze out some extra cycles too, but I wouldn't rely on this.
An alternative option would be to shard Redis-Sidekiq into multiple Redis instances, each handling a portion of the Sidekiq traffic. This would be a lot of work from an application development and testing point of view, but, if done right, would allow us to continue scaling Redis-Sidekiq up beyond a handful of nodes.
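To make the sharding idea concrete, here is a minimal sketch of one possible routing scheme: each Sidekiq queue name is hashed to pick a Redis instance, so all jobs for a given queue land on the same shard. This is an illustration only, not a proposal for how the application would implement it. The shard names and the `shard_for_queue` helper are hypothetical, and the queue names in the usage lines are placeholders.

```python
import zlib

# Hypothetical shard identifiers; real hostnames would come from configuration.
SHARDS = ["redis-sidekiq-01", "redis-sidekiq-02", "redis-sidekiq-03"]


def shard_for_queue(queue_name, shards=SHARDS):
    """Pick a shard for a queue using a stable hash of its name.

    zlib.crc32 is used instead of the built-in hash() because hash() on
    strings is randomized per process, and every client must agree on
    the mapping.
    """
    digest = zlib.crc32(queue_name.encode("utf-8"))
    return shards[digest % len(shards)]


# Every process computes the same shard for the same queue name.
for queue in ["default", "mailers", "pipeline_processing"]:
    print(queue, "->", shard_for_queue(queue))
```

One caveat with simple modulo hashing is that changing the number of shards remaps most queues. An explicit queue-to-shard mapping table or consistent hashing would reduce that, at the cost of more configuration to maintain. Hashing by queue name also does nothing for a single very busy queue, which would still be bound to one core.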

