On-Call Handover 2024-04-18 15:00 UTC
Brought to you by woodhouse
- EOC egress: @sxuereb
- EOC ingress: @sarahwalker @mattmi
- IM egress: @mrincon
- IM ingress: @jayswain @cwoolley-gitlab
- CMOC egress: bfreitas hmaraszek
- CMOC ingress: lee cwilliamson bfreitas supportops hmaraszek
Previous on-call issue: #4868 (closed) - On-Call Handover 2024-04-18 07:00 UTC
📖 Summary:
What (if any) time-critical work is being handed over?
What contextual info may be useful for the next few on-call shifts?
🔴 Ongoing alerts/incidents:
GitLab
- production#17859 (closed) - severity4 2024-04-18: GitalyServiceGoserverTrafficCessationSingleNode
PagerDuty
- https://gitlab.pagerduty.com/incidents/Q3QGFI080VYQ2H - [#88073] firing - Service monitoring (gprd)
- https://gitlab.pagerduty.com/incidents/Q018ZH5YP7064F - [#88074] firing - Service sd-exporter (gprd)
- https://gitlab.pagerduty.com/incidents/Q2TSOMB3D0AWVJ - [#88093] firing - Chef client failures have reached critical levels (ops)
✅ Resolved alerts/incidents:
GitLab
- production#17860 (closed) - severity4 2024-04-18: Temporary failures on apt-get update
PagerDuty
- https://gitlab.pagerduty.com/incidents/Q3YIGSHMD0HEC6 - [#88062] firing - Service api (gprd)
- https://gitlab.pagerduty.com/incidents/Q1V16TC71BN31R - [#88063] firing - Service web (gprd)
- https://gitlab.pagerduty.com/incidents/Q1B51IDP9OO70O - [#88065] firing - Service ci-runners (gprd)
- https://gitlab.pagerduty.com/incidents/Q3HJPIXES9KUUM - [#88080] firing - Service gitaly (gprd)
- https://gitlab.pagerduty.com/incidents/Q2BBXDOVD7B945 - [#88084] firing - Multiple versions of Gitaly have been running alongside one another (gprd)
🔓 Change issues:
In Progress
- production#17857 (closed) - CR [GSTG] Increase max_wal_size from 16GB to 32GB
- production#17811 (closed) - Change rate limits for cloud.gitlab.com
- production#17783 (closed) - 2024-04-18: Record all running builds instead of only ones running in shared runners
Closed
- https://gitlab.com/gitlab-com/gl-infra/production/-/issues/17776
- https://gitlab.com/gitlab-com/gl-infra/production/-/issues/17763
- production#17600 - 2024-02-15: Upgrade logs-gprd to 8.12.1
- production#17599 (closed) - 2024-02-15: Upgrade nonprod-logs ES cluster to 8.12.1
- production#17597 (closed) - Draft: Test restoration of Gitaly backup for ops.gitlab.net