Weekly Reliability (SRE) Team Newsletter – On-call Period: 2020-12-01 - 2020-12-08
# Announcements
- [GSuite login is being replaced by Okta on 06 Dec 2020](https://gitlab.slack.com/archives/C0259241C/p1606950021192200)
- [Restrict posts to the #incident-management Slack channels](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11254) (@AnthonySandoval) I hope to get this merged on 08 December. Note that, as good practice, the EOC should add individuals to the directed Slack channels. In the `#incident-management` channel, we'll all still be able to comment in a thread.
- [Increase traffic on canary](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/9777) appears to have mitigated the issue with noisy alerts (https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3047).
- [Clarifications to Criticality levels for Production Change Issues](https://gitlab.com/gitlab-com/www-gitlab-com/-/merge_requests/69512) recently merged into the handbook.
- `#production` PagerDuty bot notifications will not fire as frequently. This is part of the [Alert Management](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/163) epic.

If there's a notification type that you're missing, please bring it up with @AnthonySandoval.
## Late entry - Google Support changing soon
GCP support will change soon. Details: https://gitlab.com/gitlab-com/runbooks/-/merge_requests/3035
#### [Engineering Week in Review](https://docs.google.com/document/d/1EkfzI85aqw8chYDBf2GLRvjKEa3s0FWHMI3u0DIr-xg/edit) Highlights:
>>>
Chun Du (Enablement): [A framework for the ownership of shared services and components](https://about.gitlab.com/handbook/engineering/development/#ownership-of-shared-services-and-components) was established. Everyone is welcome to initiate MRs to propose ownership models for any of the known shared services and components in this [table](https://about.gitlab.com/handbook/engineering/development/#shared-services-and-components). If any shared service or component is missing, please also feel free to add to the table.
>>>
>>>
Johnathan Hunt (VP of Security): Reminder to install Jamf on your MacBook if you have not already done so. Instructions are linked in the handbook here: https://about.gitlab.com/handbook/business-ops/team-member-enablement/onboarding-access-requests/endpoint-management/#enrolling-in-jamf
Everyone should have received an email with instructions as well. If you have any questions, you can reach out in the #it_help Slack channel.
This is for Mac only. If you only use a Linux machine, please let Karlia Kue know in #it_help.
>>>
<!-- Announcements for each individual SRE Team should be made in their respective sections below. -->
# Team Updates
## Core Infrastructure
- We're almost done moving the remaining cruft in Chef so that there are [no VMs](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3113) left in the _default env [2](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3114) [3](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3115).
## Datastores
* The CGroups Gitaly POC is coming to an end (for the 1st iteration). See https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11648, an initial implementation of spawning git inside cgroups. After this, we'll need to stop and evaluate how we want to test this at scale, leading to a possible rollout to production if results are positive.
* We attempted to enable checksums in our production DB last week, which surfaced a couple of problems (mainly due to Ansible and differences between staging and production). Jose and Nels will continue ironing out the process in staging this week before trying the [change](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2991) again early next week.
* New high priority for the Datastores team on the Gitaly front: [Costs savings OKR](https://gitlab.com/gitlab-com/gl-infra/mstaff/-/issues/21), moving [inactive Free users to HDDs](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/12012).
* We are working through DB performance improvements: https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/365
## Observability
- [Readiness Review for Jaeger](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11688) will require :eyes: in the next couple of weeks.
- [Elasticsearch GCP self-hosted](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/311) kickoff is 07 December 2020. We'll be engaging with development to gather requirements and align on expectations.
### New SLO Alert Names
No more `component_error_ratio_burn_rate_slo_out_of_bounds_upper` and `component_apdex_ratio_burn_rate_slo_out_of_bounds_lower` alerts. In https://gitlab.com/gitlab-com/runbooks/-/merge_requests/2980, these have been updated to more descriptive names, such as `RedisCacheServiceRailsCacheApdexSLOViolation`, `MonitoringServiceGrafanaErrorSLOViolation` and `WafServiceGitlabNetZoneErrorSLOViolation`.
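
The three example names above appear to share one shape: the service name, the word `Service`, the component (SLI) name, the SLO type, then `SLOViolation`. The sketch below only illustrates that inferred pattern; it is not taken from the merge request, and the input identifiers (`redis-cache`, `rails_cache`, and so on) are assumptions about how the services and components are keyed.

```python
import re


def slo_alert_name(service: str, component: str, kind: str) -> str:
    """Build an alert name using the pattern inferred from the examples.

    `service` and `component` are hypothetical identifiers separated by
    hyphens or underscores; `kind` is either "apdex" or "error".
    """
    if kind not in ("apdex", "error"):
        raise ValueError(f"unknown SLO kind: {kind!r}")

    def camel(identifier: str) -> str:
        # Split on '-' or '_' and capitalize each piece: rails_cache -> RailsCache
        parts = re.split(r"[-_]", identifier)
        return "".join(part.capitalize() for part in parts if part)

    return f"{camel(service)}Service{camel(component)}{kind.capitalize()}SLOViolation"


print(slo_alert_name("redis-cache", "rails_cache", "apdex"))
print(slo_alert_name("monitoring", "grafana", "error"))
print(slo_alert_name("waf", "gitlab_net_zone", "error"))
```

Reading a name right to left (type, component, service) is usually the quickest way to find the affected SLI during an incident.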
### AlertManager Slack Notifications have actions

Slack alerts now have "actions". These are buttons attached to a notification that allow quick access to further details during an incident.
MRs for this change: https://gitlab.com/gitlab-com/runbooks/-/merge_requests/3025 https://gitlab.com/gitlab-com/runbooks/-/merge_requests/3020 https://gitlab.com/gitlab-com/runbooks/-/merge_requests/3007
### Further Improvements to the actions
Some alerts have a generic link to https://dashboards.gitlab.net. You can help improve these alerts by ensuring that they have a `grafana_dashboard_link` annotation, which links to the appropriate dashboard.
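
One possible way to find alerts that still need the annotation is a small check over the rule definitions. The sketch below assumes rules are available as Prometheus-style dictionaries with `alert` and `annotations` keys; the rule names and the dashboard URL path are hypothetical, and the runbooks repository may organize its rules differently.

```python
def alerts_missing_dashboard_link(rules):
    """Return the names of alert rules with no grafana_dashboard_link annotation."""
    missing = []
    for rule in rules:
        annotations = rule.get("annotations", {})
        # An absent or empty annotation means the alert has no specific dashboard.
        if not annotations.get("grafana_dashboard_link"):
            missing.append(rule["alert"])
    return missing


# Hypothetical rules, for illustration only.
example_rules = [
    {
        "alert": "ExampleAlertWithLink",
        "annotations": {
            "grafana_dashboard_link": "https://dashboards.gitlab.net/d/example",
        },
    },
    {
        "alert": "ExampleAlertWithoutLink",
        "annotations": {"title": "Example alert without a dashboard"},
    },
]

print(alerts_missing_dashboard_link(example_rules))
```

Each name the check reports is a candidate for a follow-up MR that adds the annotation.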
---
# On-Call During This Period
| Schedule | Username |
| -------- | -------- |
| SRE 8-hour Americas | Alejandro Rodriguez |
| SRE 8-hour Americas | Cameron McFarland |
| SRE 8-hour APAC | Craig Miskell |
| SRE 8-hour EMEA | Henri Philipps |
## PagerDuty Incidents
* Number of incidents: **48**

<details>
<summary>Show/Hide Table</summary>
| Created | Summary |
| ------ | ------- |
| [2020-12-01T00:46:11Z](https://gitlab.pagerduty.com/incidents/P69NCL7) | [32477] Firing 1 - PostgreSQL dead tuples is too large |
| [2020-12-01T02:36:35Z](https://gitlab.pagerduty.com/incidents/P0NRBA0) | [32479] Firing 2 - IncreasedErrorRateOtherBackends |
| [2020-12-01T02:57:35Z](https://gitlab.pagerduty.com/incidents/POII0ZV) | [32481] Firing 2 - IncreasedErrorRateOtherBackends |
| [2020-12-01T03:00:26Z](https://gitlab.pagerduty.com/incidents/PGGNBLD) | [32482] Firing 1 - Postgres transactions showing high rate of statement timeouts |
| [2020-12-01T03:07:35Z](https://gitlab.pagerduty.com/incidents/P8Z36I6) | [32483] Firing 2 - IncreasedErrorRateOtherBackends |
| [2020-12-01T03:12:10Z](https://gitlab.pagerduty.com/incidents/PETAYNF) | [32484] Firing 1 - Postgres transactions showing high rate of statement timeouts |
| [2020-12-01T03:28:32Z](https://gitlab.pagerduty.com/incidents/P9PZTDY) | [32489] Firing 1 - Blackbox probes for https://registry.ops.gitlab.net are failing. |
| [2020-12-01T03:28:33Z](https://gitlab.pagerduty.com/incidents/P0E037E) | [32490] Firing 1 - Redis master missing for gitlab |
| [2020-12-01T03:29:44Z](https://gitlab.pagerduty.com/incidents/PELIXOS) | [32491] Firing 1 - Blackbox probes for https://ops.gitlab.net/users/sign_in are failing. |
| [2020-12-01T03:30:44Z](https://gitlab.pagerduty.com/incidents/P5IU3B5) | [32493] Firing 1 - Blackbox probes for https://ops.gitlab.net/users/sign_in are failing. |
| [2020-12-01T03:35:29Z](https://gitlab.pagerduty.com/incidents/PD9BQH5) | [32494] Firing 1 - Blackbox probes for https://ops.gitlab.net/users/sign_in are failing. |
| [2020-12-01T10:02:32Z](https://gitlab.pagerduty.com/incidents/PYLHXAD) | [32508] Firing 1 - The Puma Worker Saturation per Node resource of the api service (main stage), component has a saturation exceeding SLO and is close to its capacity limit. |
| [2020-12-01T12:20:32Z](https://gitlab.pagerduty.com/incidents/PYLSD2E) | [32513] Firing 1 - Last WAL was archived 20m 13s ago for env gprd. |
| [2020-12-01T18:25:45Z](https://gitlab.pagerduty.com/incidents/PL6QYN9) | [32526] Firing 1 - The grafana SLI of the monitoring service (`main` stage) has an error rate violating SLO |
| [2020-12-01T19:45:59Z](https://gitlab.pagerduty.com/incidents/P2Y3RNG) | [32528] Firing 1 - Blackbox probes for https://support.gitlab.com are failing. |
| [2020-12-01T21:07:44Z](https://gitlab.pagerduty.com/incidents/PRRWB81) | [32537] Firing 1 - Blackbox probes for https://status.gitlab.com are failing. |
| [2020-12-01T21:35:47Z](https://gitlab.pagerduty.com/incidents/PH7F12G) | [32538] Please see incident declaration in Slack channel: https://slack.com/app_redirect?channel=CB7P5CJS1&team=T02592416 |
| [2020-12-01T22:02:35Z](https://gitlab.pagerduty.com/incidents/PHDSDYD) | [32545] Firing 1 - Increased HAProxy Backend Connection Errors |
| [2020-12-01T22:02:36Z](https://gitlab.pagerduty.com/incidents/P1EMXLH) | [32546] Firing 1 - Increased Server Connection Errors |
| [2020-12-01T22:12:35Z](https://gitlab.pagerduty.com/incidents/PRMFCRO) | [32549] Firing 1 - Increased HAProxy Backend Connection Errors |
| [2020-12-01T22:17:35Z](https://gitlab.pagerduty.com/incidents/PP27D05) | [32550] Firing 1 - Increased Server Connection Errors |
| [2020-12-01T22:30:20Z](https://gitlab.pagerduty.com/incidents/PF375IC) | [32552] Firing 1 - Increased Server Connection Errors |
| [2020-12-01T22:30:20Z](https://gitlab.pagerduty.com/incidents/PBTJ56P) | [32553] Firing 1 - Increased HAProxy Backend Connection Errors |
| [2020-12-01T22:40:21Z](https://gitlab.pagerduty.com/incidents/P93GTR1) | [32554] Firing 1 - Increased HAProxy Backend Connection Errors |
| [2020-12-01T22:40:22Z](https://gitlab.pagerduty.com/incidents/P16C8K9) | [32555] Firing 1 - Increased Server Connection Errors |
| [2020-12-02T00:09:08Z](https://gitlab.pagerduty.com/incidents/PYXO0QL) | [32559] Firing 1 - thanos is restarting frequently |
| [2020-12-02T01:32:06Z](https://gitlab.pagerduty.com/incidents/P38R7DE) | [32561] Firing 1 - Some repositories are in read-only mode. |
| [2020-12-03T16:12:33Z](https://gitlab.pagerduty.com/incidents/P6R3UNF) | [32644] Firing 1 - The Puma Worker Saturation per Node resource of the api service (main stage), component has a saturation exceeding SLO and is close to its capacity limit. |
| [2020-12-03T21:32:56Z](https://gitlab.pagerduty.com/incidents/PFFS8JV) | [32657] Firing 1 - patroni-08-db-gprd.c.gitlab-production.internal postgres service appears down |
| [2020-12-03T21:35:58Z](https://gitlab.pagerduty.com/incidents/PZEIE84) | [32658] Firing 1 - Patroni is down |
| [2020-12-03T21:50:32Z](https://gitlab.pagerduty.com/incidents/PUVYSCB) | [32662] Firing 1 - Last WAL was archived 20m 9s ago for env gprd. |
| [2020-12-03T22:31:56Z](https://gitlab.pagerduty.com/incidents/P1GJ4R7) | [32664] Firing 1 - Postgres exporter is showing errors for the last hour |
| [2020-12-03T23:36:25Z](https://gitlab.pagerduty.com/incidents/PVME9AS) | [32669] Firing 1 - WAL-E replication has stopped |
| [2020-12-04T01:14:14Z](https://gitlab.pagerduty.com/incidents/PYZTVTE) | [32672] Firing 1 - Blackbox probes for https://forum.gitlab.com/srv/status are failing. |
| [2020-12-04T02:41:55Z](https://gitlab.pagerduty.com/incidents/PJ08H3R) | [32675] Firing 1 - Unused Replication Slots for patroni-06-db-gprd.c.gitlab-production.internal |
| [2020-12-04T03:37:56Z](https://gitlab.pagerduty.com/incidents/P3AOIS1) | [32677] Firing 1 - patroni-08-db-gprd.c.gitlab-production.internal postgres service appears down |
| [2020-12-04T03:55:32Z](https://gitlab.pagerduty.com/incidents/PNWPSPZ) | [32678] Firing 1 - Last WAL was archived 6h 25m 9s ago for env gprd. |
| [2020-12-04T04:36:56Z](https://gitlab.pagerduty.com/incidents/PJ58CWN) | [32681] Firing 1 - Postgres exporter is showing errors for the last hour |
| [2020-12-04T06:35:26Z](https://gitlab.pagerduty.com/incidents/PHX70IA) | [32690] Firing 1 - Postgres Replication lag is over 9 hours on delayed replica (normal is 8 hours) |
| [2020-12-04T06:41:56Z](https://gitlab.pagerduty.com/incidents/PZZ51I7) | [32692] Firing 1 - WAL-E replication has stopped |
| [2020-12-04T09:01:58Z](https://gitlab.pagerduty.com/incidents/PZUVO6M) | [32698] Firing 1 - Unused Replication Slots for patroni-06-db-gprd.c.gitlab-production.internal |
| [2020-12-04T09:42:56Z](https://gitlab.pagerduty.com/incidents/PHV5YHF) | [32703] Firing 1 - patroni-08-db-gprd.c.gitlab-production.internal postgres service appears down |
| [2020-12-04T10:10:32Z](https://gitlab.pagerduty.com/incidents/PL0KSZZ) | [32704] Firing 1 - The Disk Space Utilization per Device per Node resource of the patroni service (main stage), component has a saturation exceeding SLO and is close to its capacity limit. |
| [2020-12-04T10:41:56Z](https://gitlab.pagerduty.com/incidents/P9P2WS8) | [32705] Firing 1 - Postgres exporter is showing errors for the last hour |
| [2020-12-04T11:32:40Z](https://gitlab.pagerduty.com/incidents/PPDX6XC) | [32706] Firing 1 - Postgres Replication lag is over 2 minutes |
| [2020-12-04T12:37:47Z](https://gitlab.pagerduty.com/incidents/P0SXSOX) | [32710] Firing 1 - Last WAL was archived 20m 14s ago for env gprd. |
| [2020-12-04T13:33:59Z](https://gitlab.pagerduty.com/incidents/PTC8Z12) | [32714] Firing 1 - Blackbox probes for https://staging.gitlab.com/gitlab-com/operations/issues/42 are failing. |
| [2020-12-07T04:10:14Z](https://gitlab.pagerduty.com/incidents/PG2Y475) | [32820] Firing 1 - Blackbox probes for https://customers.gitlab.com are failing. |
</details>
### 7 Day Issue Stats
* Oncall issues : **0**
* Access Request : **0**
* Change Issues : **3**
* Incident Issues : **23**
* CorrectiveAction Issues : **1**
#### Change Issues
* 2020-12-07T15:48:19Z - [Add restore_command to recovery.conf in gprd](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3154)
* 2020-12-07T13:18:46Z - [add restore_command to recovery.conf in gstg](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3153)
* 2020-12-02T22:27:58Z - [Test checksums disabling/enabling db-ops playbook in staging](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3119)
#### Incident Issues
* 2020-12-07T08:54:33Z - [2020-12-07 QA tests failing on Canary](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3151) | reliability~3760140 | ~"Service::Web" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3151`
* 2020-12-07T04:15:00Z - [2020-12-07 customers.gitlab.com down](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3150) | reliability~3760139 | ~"Service::customers.gitlab.com" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3150`
* 2020-12-07T00:25:05Z - [2020-12-07 Canary Workhorse Apdex below SLI](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3149) | reliability~3760142 | ~"Service::Workhorse" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3149`
* 2020-12-04T13:40:25Z - [2020-12-04 staging HTTP500's after latest deploy](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3144) | reliability~3760140 | ~"Service::Web" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3144`
* 2020-12-04T00:53:20Z - [2020-12-04 Permission error: Anyone can push to public projects](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3141) | reliability~3760139 | ~"Service::Git" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3141`
* 2020-12-03T15:53:20Z - [2020-12-03: component shared_runner_queues of ci-runners service has an apdex-scope outside of SLO](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3137) | reliability~3760140 | ~"Service::CI Runners" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3137`
* 2020-12-03T15:06:19Z - [2020-12-03 CMOC Practice Incident](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3136) | reliability~3760142 | | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3136`
* 2020-12-03T12:32:32Z - [2020-12-03: pgbouncer saturation causing sidekiq latencies](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3135) | reliability~3760141 | ~"Service::Pgbouncer" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3135`
* 2020-12-03T12:15:04Z - [2020-12-03: Gitaly latency drop on canary](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3134) | reliability~3760141 | ~"Service::Gitaly" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3134`
* 2020-12-03T11:28:31Z - [2020-12-03: short canary web latency spikes](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3133) | reliability~3760142 | ~"Service::Web" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3133`
* 2020-12-03T10:19:10Z - [2020-12-03: fluentd log output error rate SLO violation](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3132) | reliability~3760142 | ~"Service::Logging" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3132`
* 2020-12-03T08:59:17Z - [2020-12-03: increased git update-ref errors](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3131) | reliability~3760142 | ~"Service::Gitaly" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3131`
* 2020-12-03T03:02:59Z - [2020-12-03 Merge request permission error: A non-member can merge an MR to master](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3129) | reliability~3760139 | ~"Service::GitLab Rails" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3129`
* 2020-12-02T22:04:41Z - [2020-12-02: The workhorse_auth_api SLI of the git service (`cny` stage) has an apdex violating SLO](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3118) | reliability~3760141 | ~"Service::Git" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3118`
* 2020-12-02T11:58:08Z - [2020-12-02: A feature flag caused wrong merge commit parent order](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3112) | reliability~3760141 | ~"Service::Gitaly" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3112`
* 2020-12-02T08:13:12Z - [2020-12-02: Creating Epics failing on 2 namespaces](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3111) | reliability~3760142 | ~"Service::Web" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3111`
* 2020-12-02T01:36:01Z - [2020-12-02 Some repositories are in read-only mode on praefect-file01](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3110) | reliability~3760141 | ~"Service::Praefect" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3110`
* 2020-12-01T22:23:44Z - [2020-12-01 Increased HAProxy Backend Connection Errors (fe-registry-03)](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3108) | reliability~3760141 | ~"Service::Container Registry" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3108`
* 2020-12-01T21:35:49Z - [2020-12-01 - Cannot push - gitaly RPC error](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3107) | reliability~3760141 | ~"Service::Praefect" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3107`
* 2020-12-01T17:20:22Z - [2020-12-01: Failed builds for `gitlab-com/www-gitlab-com`](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3104) | reliability~3760142 | ~"Service::CI Runners" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3104`
* 2020-12-01T10:54:25Z - [2020-12-01: Puma saturation alerts during deployment](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3103) | reliability~3760142 | ~"Service::API" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3103`
* 2020-12-01T03:01:06Z - [2020-12-01 Query timeouts](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3101) | reliability~3760141 | ~"Service::Postgres" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3101`
* 2020-12-01T00:54:08Z - [2020-12-01 Dead Tuples from Ci MInutes reset](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3100) | reliability~3760142 | ~"Service::Postgres" | `https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3100`
#### CorrectiveAction Issues
* 2020-12-03T20:59:22Z - [Analyze registry GCS libraries and timeout settings](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/12064)
* 2020-12-02T16:02:53Z - [Validating if we have any repos owned by root.root in our Gitaly File-servers in production](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/12049)
* 2020-12-01T09:50:57Z - [Improve sensitivity of notification failure alerting](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/12030)
* 2020-12-01T03:56:33Z - [Determine how to support the Ci::BatchResetMinutesWorker monthly job without melting the database](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/12027)
### Open Issue Stats
* [Oncall issues](https://gitlab.com/gitlab-com/infrastructure/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=oncall) : **9**
* [Change issues](https://gitlab.com/gitlab-com/production/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=change) : **2**
* [Incident issues](https://gitlab.com/gitlab-com/production/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=incident) : **35**
* [Access Request](https://gitlab.com/gitlab-com/infrastructure/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=access%20request) : **3**
* [CorrectiveAction](https://gitlab.com/gitlab-com/infrastructure/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=corrective%20action) : **132**
#### Open Change Issues
<details>
<summary>Show/Hide Table</summary>
| Created | Summary |
| ------- | ------- |
| [2020-12-07T15:48:19Z](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3154) | Add restore_command to recovery.conf in gprd |
| [2020-10-22T09:38:01Z](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2874) | Enable gitaly_go_user_merge_branch feature flag |
</details>
#### Open Incident Issues
<details>
<summary>Show/Hide Table</summary>
| Created | Summary |
| ------- | ------- |
| [2020-12-01T17:20:22Z](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3104) | 2020-12-01: Failed builds for `gitlab-com/www-gitlab-com` |
</details>
#### Open Oncall Issues
<details>
<summary>Show/Hide Table</summary>
| Created | Summary |
| ------- | ------- |
| [2020-10-27T14:20:44Z](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11743) | One-Time Export for micro_x |
| [2020-10-06T21:02:20Z](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11548) | Remove Repository Artifacts |
| [2020-09-14T18:52:09Z](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11363) | PS Congregate VM for GitHost to GitLab.com Migration - Afilias |
| [2020-09-02T13:47:51Z](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11244) | disable-chef-client isn't preserved over reboots |
| [2020-08-18T07:29:24Z](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11122) | Potentially obsolete alerts |
| [2020-08-11T16:39:37Z](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11098) | Investigate slow child pipeline triggering on pre.gitlab.com |
| [2020-07-28T18:19:35Z](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10957) | PS Congregate VM for BitBucket Server to GitLab.com Migration |
| [2020-07-28T17:43:40Z](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10956) | Project Import Request - ciorg/bridge/am-child-pool/api |
| [2020-03-30T13:38:11Z](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/9660) | jobs.gitlab.com cert expired unnoticed on 2020-03-28 |
</details>