Limit PagerDuty Notifications in `#support_self-managed` to Triggered Events
GitLab Support: Process Change Rollout Plan
The Story
We have a PagerDuty Slack integration that sends notifications about incidents to the Slack channel #support_self-managed. Over time, the verbosity of these notifications has increased, potentially due to PagerDuty upgrades or simply an increase in the number of incoming emergencies.
#support_self-managed is intended for discussion of Self-Managed GitLab topics. Because all incident notifications are also sent to #spt_on-call, we will limit the type of PagerDuty notifications sent in #support_self-managed to only triggered events.
This was discussed on #5040 (closed), where we agreed to only send Triggered events in the #support_self-managed channel.
The goal is to reduce noise in this channel in order to make it easier to collaborate on Self-Managed topics again.
The Roles
Role | Description |
---|---|
Champions | @klang @manuelgrabowski |
Users | All Support Engineers in all regions |
Impacted Non-Users | Any GitLab team members who consume the PD alerts, like CSMs |
Schedule
- Rollout to begin on 2023-05-01
- Rollout will be a one-time change and therefore will not be phased
- Adoption complete by 2023-05-01
Training
Training is not required, but we will communicate the change so that SEs are prepared and aware of alternatives.
Success Determination
Success
- What will success look like?
Fewer PagerDuty notifications in the #support_self-managed Slack channel
Action Plan
- Create an item in the SWIR to announce the change and include The Story on 2023-04-20
  - NOTE: On the SWIR form, add the Manager Attention tag for policy changes and action-items for Support Managers specifically (you can add multiple tags to a SWIR item)
- Post a message in the #support_team-chat Slack channel (or other support channel as appropriate) announcing the change and pointing to the SWIR announcement on 2023-04-24
- Announce the change and tell The Story in Team meetings by 2023-04-28
  - EMEA team meeting
  - AMER team meeting
  - APAC team meeting
- Other communications channels
  - Discuss in 1-1s, telling The Story, by 2023-04-28
  - Other communications channels, if required - for example, post to a CSM Slack channel if the CSMs are "impacted non-users"
- Report back on change adoption, concerns, etc. by 2023-04-28
Manager Acknowledgement Section
Expectations
- Your acknowledgement is required for this change, which takes effect on 2023-05-01. Check off your name to indicate that you have reviewed the information and will share it with your team. Ensure that your communication includes the following:
  - The change date of 2023-05-01
  - Reminder that all update notifications are still sent to #spt_on-call
Due Date
Check off your name by midnight UTC on: 2023-04-28.
Names
Support Managers
Sorted alphabetically by region / GitLab handle
AMER + USFed
APAC
EMEA
Ops
- @jcolyer (Support Operations will need to implement the change)
Senior Management team
Senior Management has already provided their approval on #5040 (comment 1338895318)
Follow-Up Plan
- How will results be captured? By whom and by when?
We do not plan to capture actual data following the change, but will observe the #support_self-managed channel to see whether collaboration levels increase. Even if they do not increase, we do not plan to revert the change.
- What is the plan for considering and making quick improvements?
Following the change, we will be attentive to any new issues this may introduce. The biggest concern (that I can see) is that there will be too few notifications following the change. I don't think this will be an issue because 1) PagerDuty dynamically updates the incident notifications as we work, and 2) all incident updates are still sent to #spt_on-call.
- What is the plan should the change be deemed unsuccessful?
We will simply re-enable the disabled update types. This will be our rollback plan.