On-Call Handover 2021-04-29 23:00 UTC

Brought to you by the Slack slash command: /sre-oncall handover

Summary:

  • Relentless miner bots.
  • No database failover today, so I got that going for me, which is nice.
  • A new ZFS-based Patroni node has been introduced to our production database fleet, ultimately to serve as a low-latency replica for the data warehouse. It replicates only from patroni-08 and should not have any effect on the production main stage database cluster.
  • The dead tuples alert keeps firing, so watch out for all these dead tuples, okay? Seriously though, OnGres says there is nothing to worry about, and they seem to be correct. Looking at graphs over the past week, everything seems within normal parameters.

What (if any) time-critical work is being handed over?

What contextual info may be useful for the next few on-call shifts?

Ongoing alerts/incidents:

Resolved actionable alerts:

Unactionable alerts:

Resolved production incidents:

Change issues:

Edited by Nels Nelson