Set up metrics, monitoring, dashboards, and alerts
- Not sure yet how much of the existing tooling we can reuse. Certainly some, but not all.
- May need to extend redis_exporter, if it does not already support harvesting cluster-specific metrics.
- Definitely need dashboard additions. Examples: We will want new dashboard sections to visualize the cluster-wide state, such as graphs showing metrics for all N masters. It may also be useful to make a dashboard for zooming in on the metrics for just a single shard's nodes (1 master and its replicas). May want a graph showing the number of healthy/failed replicas following each master (especially if we allow the cluster to assign a variable number of replicas to each master).
See dashboard for redis-cluster-ratelimiting
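As a rough illustration of the healthy/failed replicas-per-master graph mentioned above, the sketch below derives those counts from the text output of the Redis `CLUSTER NODES` command. It is not how redis_exporter works and is not a proposed implementation. The function name, the sample node IDs, and the sample addresses are hypothetical. Treating a replica as failed when its flags include `fail` or `fail?` is an assumption about what "failed" should mean here.

```python
# Sketch only: count healthy/failed replicas per master from CLUSTER NODES text.
# Each line of CLUSTER NODES output has the shape:
#   <id> <ip:port@cport> <flags> <master-id> <ping-sent> <pong-recv> <epoch> <link-state> [slots...]


def replica_health_by_master(cluster_nodes_output: str) -> dict[str, dict[str, int]]:
    """Return {master_id: {"healthy": n, "failed": m}} for every master seen."""
    counts: dict[str, dict[str, int]] = {}
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        node_id, flags_field, master_id = fields[0], fields[2], fields[3]
        flags = set(flags_field.split(","))
        if "master" in flags:
            # Make sure masters with zero replicas still appear in the result.
            counts.setdefault(node_id, {"healthy": 0, "failed": 0})
        elif "slave" in flags:
            entry = counts.setdefault(master_id, {"healthy": 0, "failed": 0})
            # Assumption: "fail" or "fail?" (suspected failure) counts as failed.
            failed = bool(flags & {"fail", "fail?"})
            entry["failed" if failed else "healthy"] += 1
    return counts


# Hypothetical sample output: three masters, three replicas, one replica failed.
SAMPLE_CLUSTER_NODES = """\
m1 10.0.0.1:6379@16379 myself,master - 0 0 1 connected 0-5460
m2 10.0.0.2:6379@16379 master - 0 1700000000000 2 connected 5461-10922
m3 10.0.0.3:6379@16379 master - 0 1700000000000 3 connected 10923-16383
r1 10.0.0.4:6379@16379 slave m1 0 1700000000000 1 connected
r2 10.0.0.5:6379@16379 slave,fail m1 1700000000000 1699999990000 1 disconnected
r3 10.0.0.6:6379@16379 slave m2 0 1700000000000 2 connected
"""

if __name__ == "__main__":
    for master, health in replica_health_by_master(SAMPLE_CLUSTER_NODES).items():
        print(master, health)
```

The same per-master grouping is what a dashboard panel or recording rule would need, whichever component ends up exporting the data.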
Tasks
1. Add redis-cluster-ratelimiting service into runbooks
2. Add cluster-specific recording rules (if needed)
3. Create standard Redis dashboard with cluster-specific information
4. Enable alert rules
Edited by Sylvester Chin