Improve DB incident management
Database incidents typically have high severity, so it is very important to handle them as efficiently as possible. Some possible improvements:
- make it easy to escalate and reach out for support/help
- better documentation on how to page OnGres (e.g. in the Incident Management handbook)
- add paging instructions to the #ongres-gitlab Slack channel title
- ways to reach out to the database team on weekends?
Relates to
- #11832 EOC Queue Q2 2021
Activity
- Henri Philipps added Service::Postgres, corrective action, workflow-infra::Triage + 1 deleted label
- Henri Philipps marked this issue as related to production#2885 (closed)
- Henri Philipps mentioned in issue production#2885 (closed)
- Alberto Ramos added DStores-BacklogDBOperations label
- Dave Smith mentioned in issue #11749 (moved)
- Alberto Ramos changed the description
We can start with this small improvement, @hphilipps: gitlab-com/www-gitlab-com!66968 (merged)
Now if we search for OnGres in the handbook, we'll find that entry immediately.
However, we should continue with point #2 from the description, as that is the most important one.
- Alberto Ramos added workflow-infra::In Progress label and removed workflow-infra::Triage label
- Alberto Ramos marked this issue as related to #11832 (closed)
This is related to point #2 from the description: https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11832
- Alberto Ramos removed workflow-infra::In Progress label
- Alberto Ramos added workflow-infra::Triage label
- Alberto Ramos added severity::3 label
- Brent Newton added team::Reliability label and removed 1 deleted label
- Brent Newton added workflow-infra::Ready label and removed workflow-infra::Triage label
- Anna Liisa Moter added workflow-infra::Under Review label and removed workflow-infra::Ready label
- Owner
This issue still seems valid, although we need to confirm whether OnGres is still available for support. Moving it to Ready.
- Marcel Chacon added workflow-infra::Ready label and removed workflow-infra::Under Review label
- Anna Liisa Moter added 1 deleted label
- Maintainer
Hi @hphilipps,
We have deprecated the use of the ca::refined label in favor of workflow-infra::Ready, so this issue has been relabeled to that effect. See gitlab-com/www-gitlab-com!100690 (merged) for more info.
- 🤖 GitLab Bot 🤖 removed 1 deleted label
- Anthony Fappiano added Reliability::P4 label
- Maintainer
Hi @hphilipps,
We're updating the workflow-infra scoped label to workflow-infra::Triage because this issue meets the following criteria:
- State is OPEN
- It is a corrective action
- It is assigned to team::Reliability
- It is workflow-infra::Ready
- There has been no activity for at least 3 months (since 2022-08-08T11:19:15.242Z)
Perhaps the reason for inactivity is that this corrective action is missing information? Please review and either close the corrective action if it is no longer relevant, or update it and label it as workflow-infra::Ready.
If you have any questions, please reach out to an Infrastructure Manager.
Thanks for your help!
🖤
You are welcome to help improve this comment.
- 🤖 GitLab Bot 🤖 added workflow-infra::Triage label and removed workflow-infra::Ready label
Closing as I don't think it is relevant anymore.
- Anthony Fappiano closed
- 🤖 GitLab Bot 🤖 added workflow-infra::Done label and removed workflow-infra::Triage label