Design a model for scaling redis-sidekiq
Define a model that keeps the existing components functioning as they do today. The model must be compatible with all the components listed in the above question. For example, a share-nothing instance-per-queue model is likely to conflict with Sidekiq's default ScheduledSet flow, so we would have to either pick another model or re-implement Sidekiq's scheduler to fit it. We discuss some approaches here.
Conclusion
This issue compares two possible models:
- Sidekiq Functional Sharding: one Redis instance per queue, plus a common instance. This approach splits the traffic on the client side. The routing logic is implemented in the application layer, which maintains a map between queues and Redis instances. All jobs belonging to a queue go to that queue's Redis instance. We also need to keep a global Redis instance shared across all shards. Because this model depends on the internal architecture of Sidekiq and on how our home-grown components work in detail, it is fragile, expensive, and complicated. The detailed analysis is in this comment.
- Zonal Sidekiq Cluster. This model shards Redis at the infrastructure layer. It creates multiple Sidekiq clusters, isolated from each other, each with its own Redis instance. Any operation that accesses multiple Redis instances must go through a federation layer. This layer also lets us avoid modifying the majority of components. The model is not necessarily tightly coupled to Kubernetes, but Kubernetes is a good way to implement it: we can map one Sidekiq cluster to each Kubernetes availability zone. The detailed explanation can be found in this comment.
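The two models can be contrasted with a minimal Ruby sketch. Everything in it is hypothetical and for illustration only: `FakeRedis` is an in-memory stand-in for a Redis instance, and `FunctionalRouter` and `Federation` are not Sidekiq classes or existing components. Using the common instance as a fallback for unmapped queues is an assumption of this sketch; the issue only states that a global instance must exist.

```ruby
# Hypothetical sketch: in-memory objects stand in for Redis instances.
# None of these classes exist in Sidekiq or in our codebase.

# Stand-in for one Redis instance: queue name => array of job payloads.
class FakeRedis
  attr_reader :name

  def initialize(name)
    @name = name
    @queues = Hash.new { |hash, key| hash[key] = [] }
  end

  def push(queue, job)
    @queues[queue] << job
  end

  def size(queue)
    @queues.key?(queue) ? @queues[queue].size : 0
  end
end

# Model 1: functional sharding. The application layer owns the map from
# queue to instance. Assumption: unmapped queues use the common instance.
class FunctionalRouter
  def initialize(queue_map, common)
    @queue_map = queue_map
    @common = common
  end

  def instance_for(queue)
    @queue_map.fetch(queue, @common)
  end

  def push(queue, job)
    instance_for(queue).push(queue, job)
  end
end

# Model 2: zonal clusters. Each cluster owns one instance, and only the
# federation layer talks to more than one of them.
class Federation
  def initialize(clusters)
    @clusters = clusters # zone name => FakeRedis
  end

  def push(zone, queue, job)
    @clusters.fetch(zone).push(queue, job)
  end

  # Cross-cluster read: total length of a queue over all zones.
  def queue_size(queue)
    @clusters.values.sum { |redis| redis.size(queue) }
  end
end

# Usage: model 1 routes by queue name, model 2 routes by zone.
common = FakeRedis.new("common")
mailers = FakeRedis.new("mailers")
router = FunctionalRouter.new({ "mailers" => mailers }, common)
router.push("mailers", "job-1")
router.push("default", "job-2")

federation = Federation.new({ "zone-a" => FakeRedis.new("a"), "zone-b" => FakeRedis.new("b") })
federation.push("zone-a", "default", "job-3")
federation.push("zone-b", "default", "job-4")
```

In the first model every producer and consumer has to consult the router, which is why it reaches into Sidekiq internals. In the second, components inside a cluster see a single Redis instance, and only cross-cluster operations such as `queue_size` go through the federation layer.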
The first model is a painful journey that yields no advantages over the second one. Therefore, we decided to move forward with the Zonal Sidekiq Cluster model.
Edited by Quang-Minh Nguyen