Weekly Reliability (SRE) Team Newsletter – On-call Period: 2022-11-22 - 2022-11-29
Announcements
Engineering Week in Review Highlights:
Team Updates
On-Call During This Period
| Schedule | Username |
|---|---|
| SRE 8-hour Americas | Nels Nelson |
| SRE 8-hour Americas | Stephanie Jackson |
| SRE 8-hour APAC | Furhan Shabir |
| SRE 8-hour APAC | Gonzalo Servat |
| SRE 8-hour EMEA | Ahmad Sherif |
| SRE 8-hour EMEA | Steve Azzopardi |
PagerDuty Incidents
See the 1-week report for acknowledged PagerDuty pages (long-term trend).
Alerts Volume
7 Day Issue Stats
- Oncall issues : 0
- Access Request : 0
- Change Issues : 4
- Incident Issues : 6
- CorrectiveAction Issues : 0
Change Issues
- 2022-11-22T19:07:27Z - Rebuild Staging Zonal Cluster us-east1-c (production#8083 - closed)
- 2022-11-22T18:54:06Z - Reindex top 25 indexes on registry DB with the ... (production#8082 - closed)
- 2022-11-22T17:41:02Z - 2022-12-23 (C2): Automate dev.gitlab.org and re... (production#8081 - closed)
- 2022-11-22T01:10:31Z - [GRPD] Pull an authentication event (production#8079 - closed)
Incident Issues
- 2022-11-27T23:59:05Z - 2022-11-27: customers.gitlab.com is down (production#8088 - closed) | reliability~3760140 | ~"Service::Customers" | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8088
- 2022-11-26T17:39:10Z - 2022-11-26: WebServiceLoadbalancerErrorSLOViola... (production#8087 - closed) | reliability~3760141 | ServiceWeb | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8087
- 2022-11-25T09:49:46Z - 2022-11-25: Gitaly Apdex dropped for file-61-st... (production#8086 - closed) | reliability~3760142 | ServiceGitaly | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8086
- 2022-11-25T06:30:23Z - 2022-11-25: customer.gitlab.com unavailable (production#8085 - closed) | reliability~3760140 | ~"Service::Customers" | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8085
- 2022-11-23T02:11:40Z - 2022-11-23: Grafana dashboards aren't showing m... (production#8084 - closed) | reliability~3760141 | ServiceOncall-Tooling | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8084
- 2022-11-21T19:29:21Z - 2022-11-21: The Disk Space Utilization per Devi... (production#8078 - closed) | reliability~3760141 | ServicePatroni | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8078
CorrectiveAction Issues
Open Issue Stats
- Oncall issues : 2
- Change issues : 22
- Incident issues : 8
- Access Request : 0
- CorrectiveAction : 88
Open Change Issues
Open Incident Issues
| Created | Summary |
|---|---|
| 2022-11-02T14:18:07Z | 2022-11-02: Intermittent Internal API unreachable (production#7979 - closed) |
| 2022-10-25T20:05:24Z | 2022-10-25: Intermittent kas.gitlab.com timeouts (production#7924 - closed) |
Open Oncall Issues
| Created | Summary |
|---|---|
| 2021-09-17T19:35:34Z | https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/14205+ |
| 2020-12-18T22:29:14Z | https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/12200+ |
Issues for Review during Incident Review Meeting
If you think an incident would be good to review, please add it to the agenda for the next meeting.
Edited by Anthony Fappiano