Weekly Reliability (SRE) Team Newsletter – On-call Period: 2023-01-24 - 2023-01-31

Announcements

Engineering Week in Review Highlights:

Team Updates


On-Call During This Period

Schedule               Username
SRE 8-hour Americas    Cindy Pallares
SRE 8-hour Americas    Cameron McFarland
SRE 8-hour Americas    Marcel Chacon
SRE 8-hour APAC        Devin Sylva
SRE 8-hour EMEA        Ahmad Sherif
SRE 8-hour EMEA        Steve Azzopardi

PagerDuty Incidents

See the 1-week report for acknowledged PagerDuty pages (long-term trend).

Alerts Volume

7 Day Issue Stats

  • Oncall Issues: 0
  • Access Requests: 0
  • Change Issues: 16
  • Incident Issues: 19
  • CorrectiveAction Issues: 0

Change Issues

Incident Issues

CorrectiveAction Issues

Open Issue Stats

Open Change Issues

Created Summary
2023-01-27T19:43:52Z 2023-01-27: [STAGING] Improve caching policy in... (production#8310 - closed)
2023-01-27T16:50:11Z 2023-01-31: Functionally shard redis-repository... (production#8309 - closed)
2023-01-26T19:19:21Z 2023-02-03: Thanos compactor migration (production#8303 - closed)
2023-01-25T14:24:03Z https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8291
2023-01-25T13:55:33Z [GPRD] Execute the logical replication test in ... (production#8290 - closed)
2023-01-19T14:51:06Z 2023-02-07: Removing PREVENT_LOAD_BALANCER_RETR... (production#8263 - moved)
2023-01-10T07:17:05Z https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8225
2022-12-22T03:57:59Z 2022-12-22: Add registry expiration policies to... (production#8186 - closed)
2022-12-01T00:14:18Z Enable Index Lifecycle Management for Advanced ... (production#8113 - closed)
2022-11-14T23:01:34Z Change the schedule of automatic database reind... (production#8048 - closed)
2022-10-18T07:03:19Z 2022-10-18: [GPRD] Use turbo mode on restore co... (production#7893 - closed)
2022-10-18T06:45:39Z 2022-10-18: [GPRD] Use turbo mode on restore co... (production#7892 - closed)
2022-10-07T22:14:21Z Delete the production gemnasium Cloud SQL datab... (production#7848 - closed)
2022-10-06T07:05:45Z Disable fastupdate on merge_requests GIN indexes (production#7840 - closed)
2022-10-05T00:00:51Z Reduce io-threads on redis-persistent in staging (production#7834 - closed)
2022-09-02T09:37:55Z Removal of sigin-Page text for staging and .com (production#7681 - closed)
2022-08-25T14:11:48Z [GPRD] Enable pg_wait_sampling in Production (production#7653 - closed)
2022-08-17T19:40:56Z Move version.gitlab.com to new namespace, GCP p... (production#7615 - closed)
2022-08-16T16:43:58Z 2022-08-16: Increase max_client_conn for pgboun... (production#7607 - closed)
2022-07-22T14:23:23Z 2022-07-22: Update home page url in staging fro... (production#7497 - closed)
2022-06-06T15:55:01Z Rollout usage of ci-gateway ILB for on shared r... (production#7206 - closed)
2022-05-23T16:22:23Z 2022-05-23: Update auth scopes to include servi... (production#7117 - closed)

Open Incident Issues

Created Summary
2023-01-28T18:54:29Z 2023-01-28: Thanos Compactor finding duplicate ... (production#8314 - closed)
2023-01-27T21:22:17Z 2023-01-27: PatroniServiceRailsPrimarySqlApdexS... (production#8311 - closed)
2023-01-26T08:40:08Z 2023-01-26: SSL certificate for pages.gitlab.io... (production#8296 - closed)
2023-01-24T17:18:04Z 2023-01-24: Chef client has been disabled for a... (production#8287 - closed)

Open Oncall Issues

Created Summary
2021-09-17T19:35:34Z https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/14205
2020-12-18T22:29:14Z https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/12200

Issues for Review during Incident Review Meeting

If there are any incidents you think would be good to review, please add them to the agenda for the next meeting.
Edited by ops-gitlab-net