Increase Redis Sentinel maxclients limit on redis-sidekiq

Background

We discovered in production#17468 (comment 1741766338) that we're getting "ERR max number of clients reached" responses from Redis Sentinel a few times a day.

We initially thought these errors came from the redis process on redis-sidekiq (maxclients 50k), but we discovered in #2867 (comment 1774133043) that they in fact come from the sentinel process (maxclients 10k).

Impact

This is producing short bursts of user-facing errors. While it is not yet impacting SLOs, this is a very worrying development because:

  • We are saturating a critical resource, which could get much worse and have cascading effects.
  • Our monitoring does not detect this saturation (we have no redis_exporter for sentinel).
  • Our capacity planning did not forecast this saturation.
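
To illustrate the saturation signal we are currently missing: Sentinel reports a connected_clients field in its INFO output, and dividing it by maxclients gives a utilization ratio that could be alerted on. The following is a minimal sketch, not existing tooling. It assumes the INFO text has already been fetched (for example with `redis-cli -p 26379 INFO clients`, 26379 being the default Sentinel port) and that the limit is supplied by the caller. The function names and the sample reply are hypothetical.

```python
def parse_info(info_text: str) -> dict[str, str]:
    """Parse the `key:value` lines of a Redis INFO reply into a dict."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Clients".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def client_saturation(info_text: str, maxclients: int = 10_000) -> float:
    """Return connected_clients / maxclients (1.0 means the limit is reached).

    The 10_000 default mirrors the sentinel limit mentioned in this issue;
    pass the real configured value when it differs.
    """
    if maxclients <= 0:
        raise ValueError("maxclients must be positive")
    connected = int(parse_info(info_text)["connected_clients"])
    return connected / maxclients


if __name__ == "__main__":
    # Illustrative reply only, not real production data.
    sample = "# Clients\r\nconnected_clients:9500\r\nblocked_clients:0\r\n"
    print(f"saturation: {client_saturation(sample):.0%}")
```

A real exporter would scrape this continuously; the sketch only shows the ratio that an alert threshold would be applied to.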

Proposal

  • We should bump maxclients to 50k for Redis Sentinel on redis-sidekiq (and possibly all other sentinels). 👉 #2754 (comment 1928247689)
    • It may be possible to do this without any omnibus changes (source). We need to verify that.
    • Omnibus changes would greatly increase the scope; see #2878 (moved).
  • We should introduce redis_exporter for sentinel 👉 #2754 (comment 2050461872)
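
One thing worth verifying alongside the maxclients bump: Redis reserves 32 file descriptors for internal use and lowers the effective maxclients at startup when the process's open-file limit cannot cover maxclients + 32. The sketch below shows that arithmetic in simplified form, on the assumption that the Sentinel process behaves the same way as the Redis server here. The function name is hypothetical, and the check run at the bottom inspects the limit of the current Python process, not of a running sentinel.

```python
RESERVED_FDS = 32  # descriptors Redis keeps for its own internal use


def effective_maxclients(requested: int, nofile_limit: int) -> int:
    """Approximate the maxclients Redis can honor under an open-file limit.

    Simplified model: the limit must cover `requested + RESERVED_FDS`,
    otherwise the client limit is reduced to whatever fits.
    """
    return max(0, min(requested, nofile_limit - RESERVED_FDS))


if __name__ == "__main__":
    import resource  # Unix only

    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        print("open-file limit is unlimited; 50000 clients would fit")
    else:
        print(f"soft nofile limit: {soft}")
        print(f"effective maxclients for 50000: {effective_maxclients(50_000, soft)}")
```

If the effective value comes out below 50k on the sentinel hosts, the open-file limit for the sentinel service would need raising as well.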
Edited by Furhan Shabir