Tier 2 SME Fulfillment onboarding
Tier 2 SME Group Onboarding for Incident Response
Summary
- Group Name: Fulfillment
- Group Manager / DRI: @jameslopez
- Slack Channel: #s_fulfillment_engineering
Tier-2 SME on-call (Level 1)
Tier 2 rotations are for subject matter experts. On average, they should know more about their subject matter than engineers outside of the group.
Tier-2 SME on-call (Level 1) Members
Tier-2 SME on-call (Level 1) Schedule
All times are in UTC
Name: EMEA
- Times: M T W T F : 07:00 - 15:00
- Handover Time: 07:00
- Change Shifts: Weekly, Monday 07:00
- Members:
- Ammar Alakkad
- Angelo Gulina
- Bishwa Hang Rai
- Corinna Gogolok
- Divya Mahadevan
- Kos Palchyk
- Lukas Wanko
- Michael Lunoe
- Paulo Barros
- Roy Zwambag
- Sharmad Nachnolkar
- Sheldon Led
- Shreyas Agarwal
- Vijay Hawoldar
- Vitaly Slobodin
Name: AMER
- Times: M T W T F : 15:00 - 23:00
- Handover Time: 15:00
- Change Shifts: Weekly, Monday 07:00
- Members:
- Aishwarya Subramanian
- Etienne Baque
- Jason Goodman
- Jorge Cook
- Katherine Richards
- Minahil Nichols
- Ryan Cobb
- Tyler Amos
- Valerie Burton
- Vladlena Shumilo
Name: APAC
- Times: M T W T F : 23:00 - 07:00
- Handover Time: 23:00
- Change Shifts: Weekly, Monday 23:00
- Members:
- Abhay V Ashokan
- Aman Luthra
- Josianne Hyson
- Matt Sroufe
- Qingyu Zhao
- Suraj Tripathi
- Tarun Vellishetty
- Vamsi Vempati
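The three shift windows above can be checked mechanically. The following is a minimal sketch, not part of the official tooling, that returns which regional rotation covers a given UTC moment. The names `SHIFTS` and `region_on_call` are hypothetical, and the sketch assumes that a shift belongs to the weekday on which it starts (so the Friday APAC shift runs into Saturday 07:00 UTC, and nothing covers Monday before 07:00 UTC). The real incident.io schedule is the source of truth.

```python
from datetime import datetime, time, timedelta, timezone
from typing import Optional

# Shift windows copied from the schedule above (all times UTC).
# APAC starts at 23:00 and ends at 07:00 the next day.
SHIFTS = [
    ("EMEA", time(7, 0), time(15, 0)),
    ("AMER", time(15, 0), time(23, 0)),
    ("APAC", time(23, 0), time(7, 0)),
]


def region_on_call(moment: datetime) -> Optional[str]:
    """Return the region whose Level 1 shift covers `moment`, or None.

    Assumption: a shift counts for the weekday it starts on (Mon-Fri).
    Naive datetimes are treated as UTC.
    """
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)
    moment = moment.astimezone(timezone.utc)
    now = moment.time()

    for region, start, end in SHIFTS:
        if start < end:
            # Same-day window, e.g. 07:00 - 15:00.
            covers = start <= now < end
            shift_day = moment
        elif now >= start:
            # Overnight window, part before midnight.
            covers = True
            shift_day = moment
        else:
            # Overnight window, part after midnight: the shift began yesterday.
            covers = now < end
            shift_day = moment - timedelta(days=1)

        if covers and shift_day.weekday() < 5:  # 0-4 is Monday-Friday
            return region
    return None
```

For example, `region_on_call(datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc))` (a Monday) falls in the EMEA window.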
Tier-2 SME Escalation on-call (Level 2)
Folks in this level are paged when the initial page to Level 1 is not acknowledged within 15 minutes. Rotation owners must be in the escalation path for their rotations.
- Choose one:
  - Round-robin all team members after 15 minutes (no need to fill the Level 2 template below)
  - New group of a small number of folks
  - Shared escalation schedule consisting of rotation leaders
- Fill the current onboarding issue with the Escalation on-call schedule if it is not defined yet, and remove the schedule template below
- Link to the schedule in incident.io where this Escalation Schedule is defined: schedule here
Tier-2 SME Escalation oncall (Level 2) Members
- Member 1
- Member 2
Tier-2 SME Escalation oncall (Level 2) Schedule
Escalation Path
flowchart TD
%% Starting point
incident[Incident Escalated to Team]
%% Level 1 - Tier 2 SME On-call
level1[Level 1: Page Current SME On-call<br/>Based on schedule rotation]
%% Acknowledgment check
ack_level1{Acknowledged<br/>within 15min?}
%% Level 2 escalation
level2[Level 2: SME Escalation On-call<br/>Paged after no Level 1 response]
%% Schedule note
schedule_note[Note: Each level requires<br/>its own incident.io schedule]
%% Final escalation
level2_ack{Level 2<br/>Acknowledged?}
final_escalation[Further Escalation<br/>DRI/Manager involvement]
%% Success path
resolved[Incident Handled]
%% Flow connections
incident --> level1
level1 --> ack_level1
ack_level1 -->|Yes| resolved
ack_level1 -->|No| level2
%% Level 2 handling
level2 --> level2_ack
level2_ack -->|Yes| resolved
level2_ack -->|No| final_escalation
%% Schedule note positioning
level2 -.-> schedule_note
%% Styling
classDef level1 fill:#fff3e0
classDef level2 fill:#fce4ec
classDef decision fill:#f3e5f5
classDef success fill:#e8f5e8
classDef escalation fill:#ffebee
classDef note fill:#f5f5f5,stroke:#999,stroke-dasharray: 5 5
class level1 level1
class level2 level2
class ack_level1,level2_ack decision
class resolved success
class final_escalation escalation
class schedule_note note
The default escalation path can be changed. Time intervals can be adjusted, and notification options are not fixed. If unsure, the defaults should be a reasonable starting point.
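The default path in the flowchart can be read as a small decision function. The sketch below is illustrative only: `escalation_outcome` and its return labels are hypothetical names, the 15-minute Level 1 window comes from the text, and because the source gives no Level 2 timeout, Level 2 acknowledgement is modelled as a plain yes/no. Treating an acknowledgement at exactly 15 minutes as "within 15 minutes" is also an assumption.

```python
from typing import Optional

# From the text: Level 2 is paged if Level 1 does not acknowledge within 15 minutes.
LEVEL1_ACK_TIMEOUT_MIN = 15


def escalation_outcome(
    level1_ack_after_min: Optional[float],
    level2_acknowledged: bool,
) -> str:
    """Walk the default escalation path and return where the page ends up.

    level1_ack_after_min: minutes until Level 1 acknowledged, or None if never.
    level2_acknowledged:  whether Level 2 acknowledged once paged
                          (no Level 2 timeout is modelled; the source gives none).
    """
    level1_in_time = (
        level1_ack_after_min is not None
        and level1_ack_after_min <= LEVEL1_ACK_TIMEOUT_MIN  # boundary is an assumption
    )
    if level1_in_time:
        return "handled_by_level1"
    if level2_acknowledged:
        return "handled_by_level2"
    # Neither level responded: DRI/Manager involvement in the flowchart.
    return "further_escalation"
```

If the time intervals are adjusted in incident.io, only the constant would change in this sketch; the shape of the path stays the same.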
DRI Checklist
- Go through the Rotation Leader LevelUp channel for detailed instructions on how to onboard your team (optional)
- Finalize on-call team members for each level
- Fill the schedule section for Level 1 and Level 2 above in this issue.
- If any of the members are part of the Incident Manager on-call rotation, please create an issue like this example here to have them removed (where possible) from the IM rotation.
- If any of the team members are part of the Dev on-call rotation, please add their emails to the Excluded Team Member Emails tab, with the name of the rotation under reason, in the eligibility spreadsheet to exclude them from the rotation.
- Decide on an escalation option to move forward with for Level 2 in the escalation chain
  - Note: As the rotation leader/owner, you must be in the escalation chain. You can either include yourself in Level 2 or add an extra step that pages you if a page to Level 2 also goes unacknowledged.
- On-call license setup and access
  - Use the Slack command /request to raise a request in Lumos for yourself to get On Call Scheduler access, so that you can set up the rotation on incident.io
  - Ensure each team member has Full access in the "on-call seat" column on the incident.io users page (verify here). If not, ask the Networking & Incident Management team to provide it for any team members who need it by pinging a member of that team on the issue. DO NOT USE THE ACCESS REQUEST TEMPLATE process for this. This is not granting permission; it is granting a full access license (for billing purposes) so that the user can use the on-call features.
- Set up schedules and escalation path
  - Once the schedule section above in this issue is filled, create a Schedule for your team using incident.io. To do so, you can duplicate the SAMPLE tier2 - TEAMNAME schedule and edit it as per your requirements, adding the members accordingly. For the schedule name, use the format tier2 - <team name>. This is your SME on-call schedule (Level 1).
  - Set up the SME escalation on-call (Level 2) and the escalation path based on the option chosen. Rotation owners must be in the escalation path for their rotations: you can create a third level in the escalation path that pages you after no response from Level 2, or you can be a member of the Level 2 rotation.
  - Level 2 option chosen: Round-robin
    - Navigate to Escalation paths in the incident.io UI, duplicate the tier2 - team_name - Round_robin escalation path for reference, and edit it as per your requirements. Make sure to add your Schedules to the Escalation path. For the Escalation Path name, use the format tier2 - <team name>
    - Refer to the Round-robin incident.io doc to figure out the best way to implement this for your team. A good starting point is to cycle through the responders every 10 minutes and time out after 60 minutes, moving to the next step in the escalation path
  - Level 2 option chosen: New small group of folks / Shared escalation schedule consisting of rotation leaders
    - Create a new Schedule for the SME escalation on-call (Level 2) based on the schedule information
    - Navigate to Escalation paths in the incident.io UI, duplicate the tier2 team_name - Small_group / Shared rotation leader escalation path for reference, and edit it as per your requirements. Make sure to add your Schedules to the Escalation path: add the SME on-call schedule in Level 1 and the Escalation SME on-call schedule in Level 2. For the Escalation Path name, use the format tier2 - <team name>
  - Note: In the Escalation Path on incident.io, Notify pages the folks on the schedule, whereas Notify on Slack Channel only notifies them on Slack
- Prepare team for on-call
  - Inform rotation members to ignore notifications about upcoming on-call shifts, with a message like the one below:
    "Hi, you'll be getting a notification about upcoming on-call shifts. Do not worry, you will not be paged yet. We will only activate the rotation on date X. Any shifts scheduled before that are just for us to test the setup and prepare for the go-live."
  - Instruct your team members to set their notification preferences in the incident.io UI; these determine how they are informed when they are paged
  - While it's not mandatory, it is recommended to have the incident.io app installed on each member's mobile device
  - Share related handbook links with the rotation members
  - Review the On-Call Readiness dashboard
  - Instruct the team members to finish the Tier-2 LevelUp course
- Go live!
  - On the due date, update the On-call teams catalog with the name (tier2 - <team name>) and escalation path of your team. Each row in the catalog helps populate the drop-down menu that EOCs use to select the team to page.
  - Announce in #eoc-general that your team is ready to be paged, giving a high-level description of the areas this SME group covers and this handbook link

Congratulations, you are now ready to be on-call!
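For teams that pick the round-robin option, the suggested starting point in the checklist (cycle through responders every 10 minutes, time out after 60 minutes) can be reasoned about with a short sketch. This is not how incident.io implements round-robin internally; `round_robin_responder` and its parameters are hypothetical names, and the defaults simply mirror the numbers suggested in the checklist.

```python
from typing import Optional, Sequence


def round_robin_responder(
    responders: Sequence[str],
    minutes_elapsed: float,
    cycle_min: float = 10,    # suggested in the checklist: rotate every 10 minutes
    timeout_min: float = 60,  # suggested in the checklist: give up after 60 minutes
) -> Optional[str]:
    """Return who is being paged `minutes_elapsed` after the step started.

    Returns None once the step has timed out (the escalation path would then
    move to its next step) or if there is nobody to page.
    """
    if minutes_elapsed < 0:
        raise ValueError("minutes_elapsed must not be negative")
    if not responders or minutes_elapsed >= timeout_min:
        return None
    # Each responder holds the page for one cycle; wrap around when the list ends.
    index = int(minutes_elapsed // cycle_min) % len(responders)
    return responders[index]
```

With three responders and the default values, each person would be paged twice before the step times out, which may help when deciding whether 10 and 60 minutes suit the size of the team.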