Redis service start on boot
Summary
During incident production#8201 (closed), we noticed that neither Redis nor Sentinel started automatically after a reboot. This led to quorum loss and service downtime.
One of the replicas had not been working since 29/09/22, so when the primary rebooted due to a GCP host failure, no quorum was possible. Had the services started on boot, we would have recovered within a few minutes, but that was not the case.
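The quorum arithmetic behind this can be sketched as follows. This is a minimal illustration, not a description of the production topology: it assumes a three-node deployment with one Sentinel per node (the node count is not stated in this issue), and `has_majority` is a hypothetical helper name. Sentinel needs a majority of Sentinel processes to authorize a failover, so with one node already down, losing a second leaves no majority.

```python
def has_majority(total_sentinels: int, reachable_sentinels: int) -> bool:
    """Return True if the reachable Sentinels form a strict majority."""
    return reachable_sentinels > total_sentinels // 2


# Assumption: three Sentinels in total (not confirmed by the incident notes).
TOTAL_SENTINELS = 3

# All nodes healthy: a failover can be authorized.
print(has_majority(TOTAL_SENTINELS, 3))

# One replica down (the state since 29/09/22): still a majority.
print(has_majority(TOTAL_SENTINELS, 2))

# Primary also rebooted and its services did not start: no majority.
print(has_majority(TOTAL_SENTINELS, 1))
```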
We want to confirm that:
- Replica/Primary recover from a normal reboot
- Replica/Primary recover from an instance reset (similar to the incident event)
production#8201 (comment 1228518701)
Related Incident(s)
Originating issue(s): production#8201 (closed)
Desired Outcome/Acceptance Criteria
Redis and Sentinel services recover on a node reboot (soft or hard).
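One way to check part of this criterion before rebooting anything might look like the sketch below. It assumes the nodes manage these services with systemd, which this issue does not confirm, and the unit names `redis-server.service` and `redis-sentinel.service` are placeholders to be replaced with whatever the nodes actually use. It only reports units that are not enabled for boot; it does not replace the reboot and instance-reset tests listed above.

```python
import subprocess
from typing import Callable, Iterable, List

# Assumption: placeholder unit names; adjust to the real ones on the nodes.
UNITS = ("redis-server.service", "redis-sentinel.service")


def systemctl_is_enabled(unit: str) -> str:
    """Return the state printed by `systemctl is-enabled <unit>`."""
    result = subprocess.run(
        ["systemctl", "is-enabled", unit],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is expected for disabled units
    )
    return result.stdout.strip()


def units_not_enabled(
    units: Iterable[str],
    query: Callable[[str], str] = systemctl_is_enabled,
) -> List[str]:
    """Return the units whose reported state is anything other than 'enabled'."""
    return [unit for unit in units if query(unit) != "enabled"]
```

Calling `units_not_enabled(UNITS)` on a node returns the units that would not start on boot; an empty list means both are enabled. The `query` parameter exists so the check can be exercised without systemd, for example in a unit test.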
Associated Services
Corrective Action Issue Checklist
- Link the incident(s) this corrective action arose out of
- Give context for what problem this corrective action is trying to prevent from re-occurring
- Assign a severity label (this is the highest sev of related incidents, defaults to 'severity::4')
- Assign a priority (this will default to 'Reliability::P4')