Limit PagerDuty Notifications in `#support_self-managed` to Triggered Events

GitLab Support: Process Change Rollout Plan

The Story

We have a PagerDuty Slack integration that sends notifications about incidents to the Slack channel #support_self-managed. Over time, the verbosity of these notifications has increased, potentially due to PagerDuty upgrades or simply an increase in the number of incoming emergencies.

Because #support_self-managed is intended for discussion of Self-Managed GitLab topics, and because all incident notifications are also sent to #spt_on-call, we will limit the PagerDuty notifications sent to #support_self-managed to triggered events only.

This was discussed on #5040 (closed), where we agreed to only send Triggered events in the #support_self-managed channel.

The goal is to reduce noise in this channel in order to make it easier to collaborate on Self-Managed topics again.

The Roles

| Role | Description |
| --- | --- |
| Champions | @klang @manuelgrabowski |
| Users | All Support Engineers in all regions |
| Impacted Non-Users | Any GitLab team members who consume the PD alerts, like CSMs |

Schedule

  • Rollout to begin on 2023-05-01
  • Rollout will be a one-time change and therefore will not be phased
  • Adoption complete by 2023-05-01

Training

Training is not required, but we will communicate the change so that SEs are prepared and aware of alternatives.

Success Determination

Success

  • What will success look like?

Fewer PagerDuty notifications in the #support_self-managed Slack channel

Action Plan

  1. Create an item in the SWIR to announce the change and include The Story on 2023-04-20
    • NOTE: On the SWIR form, add the Manager Attention tag for policy changes and action-items for Support Managers specifically (you can add multiple tags to a SWIR item)
  2. Post a message in the #support_team-chat Slack channel (or other support channel as appropriate) announcing the change and pointing to the SWIR announcement on 2023-04-24
  3. Announce the change and tell The Story in Team meetings by 2023-04-28
    • EMEA team meeting
    • AMER team meeting
    • APAC team meeting
  4. Other communications channels
    • Discuss in 1-1s, telling The Story, by 2023-04-28
    • Other communications channels, if required - for example, post to a CSM Slack channel if the CSMs are "impacted non-users"
  5. Report back on change adoption, concerns, etc. by 2023-04-28

Manager Acknowledgement Section

Expectations

  • Your acknowledgement is required for this change, which takes effect on 2023-05-01. Check off your name to indicate that you have reviewed the information and will share it with your team. Ensure that your communication includes the following:
  1. The change date of 2023-05-01
  2. Reminder that all update notifications are still sent to #spt_on-call

Due Date

Check off your name by midnight UTC on: 2023-04-28.

Names

Support Managers

Sorted alphabetically by region / GitLab handle

AMER + USFed

APAC

EMEA

Ops

  • @jcolyer (Support Operations will need to implement the change)

Senior Management team

Senior Management has already provided their approval on #5040 (comment 1338895318)

Follow-Up Plan

  • How will results be captured? By whom and by when?

We do not plan to capture actual data following the change, but will observe the #support_self-managed channel to see if collaboration levels increase. Even if they do not increase, we do not plan to revert the change.

  • What is the plan for considering and making quick improvements?

Following the change, we will be attentive to any new issues it may introduce. The biggest concern (that I can see) is that there will be too few notifications following the change. I don't think this will be an issue because 1) PagerDuty dynamically updates the incident notifications as we work, and 2) all incident updates are still sent to #spt_on-call.

  • What is the plan should the change be deemed unsuccessful?

We will simply re-enable the disabled update types. This will be our rollback plan.

Edited by Tine Sørensen