Weekly Reliability (SRE) Team Newsletter – On-call Period: 2022-09-20 - 2022-09-27

Announcements

Engineering Week in Review Highlights:

Team Updates


On-Call During This Period

Schedule               Username
SRE 8-hour Americas    Alex Hanselka
SRE 8-hour Americas    Matt Smiley
SRE 8-hour APAC        Devin Sylva
SRE 8-hour EMEA        Ahmad Sherif
SRE 8-hour EMEA        Alejandro Rodriguez

PagerDuty Incidents

See the 1 week report for acknowledged PD pages (long-term trend)

Alerts Volume

7 Day Issue Stats

  • Oncall Issues: 0
  • Access Request Issues: 0
  • Change Issues: 10
  • Incident Issues: 22
  • CorrectiveAction Issues: 0

Change Issues

Incident Issues

CorrectiveAction Issues

Open Issue Stats

Open Change Issues

Created Summary
2022-09-23T15:35:33Z Fill in DORA configuration for gitlab-org/gitla... (production#7793 - closed)
2022-09-21T15:18:22Z [Production] Disable index_ci_builds_metadata_o... (production#7781 - closed)
2022-09-21T15:16:37Z [Staging] Disable index_ci_builds_metadata_on_b... (production#7780 - closed)
2022-09-21T08:16:10Z 2022-09-21: [GPRD] Use turbo mode on restore co... (production#7776 - closed)
2022-09-21T08:16:05Z Update CA certs configurations for registry DB ... (production#7775 - closed)
2022-09-20T16:16:49Z 2022-10-01: GPRD Truncate the rest of CI tables... (production#7770 - closed)
2022-09-19T06:30:03Z https://gitlab.com/gitlab-com/gl-infra/production/-/issues/7759+
2022-09-19T06:15:09Z https://gitlab.com/gitlab-com/gl-infra/production/-/issues/7758+
2022-09-15T14:50:38Z 2022-10-05: Cleanup Patroni 1604 Clusters and D... (production#7749 - closed)
2022-09-14T03:37:44Z Add CA cert to registry DB in production (production#7732 - closed)
2022-09-02T09:37:55Z Removal of sigin-Page text for staging and .com (production#7681 - closed)
2022-08-25T14:11:48Z [GPRD] Enable pg_wait_sampling in Production (production#7653 - closed)
2022-08-17T19:40:56Z Move version.gitlab.com to new namespace, GCP p... (production#7615 - closed)
2022-08-16T16:43:58Z 2022-08-16: Increase max_client_conn for pgboun... (production#7607 - closed)
2022-08-15T10:36:05Z gstg-us-east1-b Zonal cluster rebuild (production#7598 - closed)
2022-08-12T15:04:02Z 2022-08-12: Opstrace Error Tracking Open Beta R... (production#7586 - closed)
2022-08-01T08:17:13Z 2022-08-16: GPRD Scale down the number of Patro... (production#7531 - closed)
2022-08-01T08:16:04Z https://gitlab.com/gitlab-com/gl-infra/production/-/issues/7530+
2022-07-22T14:23:23Z 2022-07-22: Update home page url in staging fro... (production#7497 - closed)
2022-06-28T16:43:38Z Raise Tag Count Limit for Ongoing Phase 2 conta... (production#7343 - closed)
2022-06-27T14:16:42Z Reset Skipped Container Repository Imports for ... (production#7334 - closed)
2022-06-06T15:55:01Z Rollout usage of ci-gateway ILB for on shared r... (production#7206 - closed)
2022-05-23T16:22:23Z 2022-05-23: Update auth scopes to include servi... (production#7117 - closed)

Open Incident Issues

Created Summary
2022-09-22T22:01:34Z 2022-09-22: Consistent 500 errors when viewing... (production#7788 - closed)
2022-09-20T17:54:40Z 2022-09-20: prometheus pods continue to restart... (production#7772 - closed)

Open Oncall Issues

Created Summary
2021-09-17T19:35:34Z https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/14205+
2020-12-18T22:29:14Z https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/12200+

Issues for Review during Incident Review Meeting

If there are any incidents you think would be good to review, please add them to the Agenda for the next meeting.
Edited by ops-gitlab-net