On-Call Handover 2024-05-23 07:00 UTC
Brought to you by woodhouse
- EOC egress: @pguinoiseau
- EOC ingress: @ahmadsherif
- IM egress: @engwan
- IM ingress: @engwan
- CMOC egress: sshaik supportops
- CMOC ingress: sshaik supportops hmaraszek
Previous on-call issue: #4972 (closed) - On-Call Handover 2024-05-22 23:00 UTC
📖 Summary:
What (if any) time-critical work is being handed over?
What contextual info may be useful for the next few on-call shifts?
🔴 Ongoing alerts/incidents:
GitLab
- production#18050 (closed) - severity3 2024-05-22: Multiple customers reporting jobs hanging on private k8s runners in canceling state
- production#18032 (closed) - severity4 2024-05-18: ExtPvsServiceRunwayIngressErrorSLOViolation
- production#17980 (closed) - severity3 2024-05-08: KubeServiceClusterScaleupsErrorSLOViolation
✅ Resolved alerts/incidents:
GitLab
- production#18052 (closed) - severity3 2024-05-23: Large amount of Sidekiq Queued jobs in the elasticsearch queue
- production#18051 (closed) - severity3 2024-05-23: Missing production metrics for over 30 minutes in production
🔓 Change issues:
In Progress
- production#17947 (closed) - 2024-05-22: [CR] [gprd] Rotate Certificate Authority for GKE/Kubernetes Cluster