"All Regions" Ticket Workflow Refinement and Iteration

With the deployment of the globally aligned Zendesk views, combined with the push to reach 100% ticket assignment, we need to iterate on how to work tickets with "All Regions" as the preferred region of support.

This issue will be used to define the first iteration of the All Regions ticket workflow, with the output being an MR updating the handbook with the workflow.

Update: 2021-06-30

Thank you for the discussion and proposals below. Shaun and Lee discussed the current state and have the following pieces in mind, which still need more refinement.

Our current proposal is to relax (date of change TBD) the next response time (NRT) clock for All Regions tickets with a priority of Medium to 24 hours. We do not intend to change the NRT SLO timer for All Regions tickets with a priority of High at this time. The change in the NRT SLO clock enables an All Regions ticket with a priority of Low or Medium to be worked in the same way as a Preferred Region ticket, since both Medium and Low priority equate to a customer having a problem that is not blocking them. This means that these tickets need to be assigned on First Response, with the understanding that the priority of the ticket can and should be changed if necessary. In the short term, this reduces the handover burden, makes it easier for one engineer to lead the ticket, and gives that engineer space to set expectations and meet them.
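To make the proposed rule concrete, here is a minimal sketch in Python. It is illustrative only: the `Ticket` class, the field names, and both function names are hypothetical and do not correspond to Zendesk objects or to any existing automation. The 24-hour figure and the Medium/High split come from the proposal above; the existing timer values are deliberately not modelled.

```python
from dataclasses import dataclass
from typing import Optional

ALL_REGIONS = "All Regions"
PROPOSED_MEDIUM_NRT_HOURS = 24  # from the proposal; date of change is TBD


@dataclass
class Ticket:
    """Hypothetical stand-in for a support ticket, not a Zendesk object."""
    preferred_region: str
    priority: str  # "Low", "Medium", or "High"


def proposed_nrt_override_hours(ticket: Ticket) -> Optional[int]:
    """Return the proposed NRT in hours, or None if the existing timer is kept."""
    if ticket.preferred_region == ALL_REGIONS and ticket.priority == "Medium":
        return PROPOSED_MEDIUM_NRT_HOURS
    # High priority (and everything else) keeps whatever timer it has today.
    return None


def needs_global_handover(ticket: Ticket) -> bool:
    """Only High priority All Regions tickets stay in the global handover flow."""
    return ticket.preferred_region == ALL_REGIONS and ticket.priority == "High"
```

Under this sketch, an All Regions ticket with Medium priority gets the relaxed 24-hour clock and a single assignee, while the same ticket raised to High would fall back into the handover flow this issue now focuses on.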

Because of this change, the focus of this issue is switching to what we need to do for All Regions tickets with a High priority. An integral part of this workflow is how we want to manage handovers on tickets where a customer is blocked with GitLab being highly degraded, and ongoing focus needs to be maintained to get them back into a working state. We think that focusing our efforts on High priority first will help us create the right process for scenarios that need global handover. Whether that eventually extends to All Regions Medium and Low tickets will depend on the needs and processes that emerge.

We now have the intra-ticket feedback form, which gives us the ability to monitor for critical feedback, and if needed, we can evolve this approach. While this decision is large in that it affects all of support, it is still a two-way door decision that we could adjust in the future if needed.

As for handovers, three basic strategies have been presented:

manager-owned
engineer-led
bot-managed

Shaun and Lee are still discussing what makes sense for v1, with possible future iterations evolving the process.

For background on why we are comfortable changing the timer for Medium priority tickets: our commitment, as specified in our Subscription Agreement, is for First Response Times. We have clarified the wording in the Priority Support table so that when we talk about SLAs, they are specific to and contained by our First Response Time commitments.

As an initial proposal for what the workflow could look like in practice:

Support Engineer assigns ticket to themself
Confirm with the customer the Priority of the problem according to our definitions of support
Confirm with the customer if they require around the clock attention on the ticket
- If no, confirm which preferred region of support the ticket can be changed to
At the end of their day, the Support Engineer unassigns the ticket from themself so that it shows in the queue as an unassigned ticket
- If priority is High, find a new assignee in Slack for a warm handover (managers can assist with this, especially if the timezone gap means there is little to no overlap with the start of the next region's day)
Through the life of the ticket, adjust Priority level appropriately

This is very high-level and is meant as a starting point for what the workflow could look like.
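The steps in the initial proposal can be sketched as a small decision helper. This is a rough illustration, not a description of any tooling: `Ticket`, `take_ticket`, and `end_of_day` are hypothetical names. It also makes one assumption the proposal does not state: once a ticket has been moved from All Regions to a single preferred region, it stays with its assignee at the end of the day.

```python
from dataclasses import dataclass
from typing import List, Optional

ALL_REGIONS = "All Regions"


@dataclass
class Ticket:
    """Hypothetical stand-in for a support ticket, not a Zendesk object."""
    priority: str  # "Low", "Medium", or "High"
    preferred_region: str = ALL_REGIONS
    assignee: Optional[str] = None


def take_ticket(ticket: Ticket, engineer: str, confirmed_priority: str,
                needs_round_the_clock: bool,
                new_region: Optional[str] = None) -> None:
    """Self-assign, then record what was confirmed with the customer."""
    ticket.assignee = engineer
    ticket.priority = confirmed_priority
    # If the customer does not need round-the-clock attention, move the
    # ticket to the preferred region they agreed to.
    if not needs_round_the_clock and new_region is not None:
        ticket.preferred_region = new_region


def end_of_day(ticket: Ticket) -> List[str]:
    """Return the actions the current assignee takes at the end of their day."""
    # Assumption (not in the proposal): tickets no longer marked
    # All Regions stay with their assignee.
    if ticket.preferred_region != ALL_REGIONS:
        return []
    ticket.assignee = None
    actions = ["unassign so the ticket shows in the queue"]
    if ticket.priority == "High":
        actions.append("find a new assignee in Slack for a warm handover")
    return actions
```

Adjusting the priority through the life of the ticket would simply mean updating `ticket.priority` before the next `end_of_day` call, which changes whether a warm handover is requested.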

Edited Jul 02, 2021 by Shaun McCann