Integrate Customer Support as Tier 1 customers for Tier 2 escalations
Overview
We need to establish Customer Support as a Tier 1 customer for Tier 2 escalations. This requires clear escalation criteria that weigh customer needs against engineers' work-life balance.
Business Context
- A business decision has been made to make Support eligible as a Tier 1 customer
- A similar decision is assumed for the Dedicated team, which has expressed interest
- Escalation through the current #dev-escalation process has unpredictable results
- Weekend emergencies are often related to upgrades or migrations (for example Gitaly or database hotspots)
Key Considerations
Weekend Coverage Impact
- Tier 2 will eventually cover weekends, requiring engineers to pause personal plans
- Need to minimize weekend work while maintaining customer service quality
- Risk of creating always-on weekend culture as more Tier 1 teams are added
- Most support calls requiring dev help occur on weekends
Current Team Commitments
Engineering teams are already managing the following, all of which would be impacted by Tier 2 on-call pings:
- Engineering allocations
- Customer interlock commitments that cannot be missed
- Long-term feature work customers are counting on
Scaling Challenges
- Significant jump from 1 to 3 Tier 1 customers
- Need to prevent escalation fatigue while ensuring critical issues get addressed
Proposed Escalation Criteria
Immediate Tier 2 Escalation Required:
- Emergency situation where Support has exhausted their knowledge
- Customer cannot achieve a stable production environment within a reasonable timescale
- Customer has a deadline to meet for production system recovery
- Rollback is not an option due to rigid change windows
- S1 incidents that cannot be resolved within X minutes (TBD)
Can Go Through RFH Process:
- Non-emergency situations where customer agrees to delay
- Issues that can wait for normal business hours
- Situations where temporary fix can be applied with RFH for long-term solution
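The two lists above can be read as a routing decision. The sketch below is one possible reading, not an agreed process: every name in it (`Situation`, `Route`, `choose_route`, and all field names) is hypothetical. It assumes that any single immediate criterion is enough to qualify, that an RFH condition takes precedence when both apply, and that anything matching neither list defaults to RFH. The S1 threshold is the "X minutes" that is still TBD, so it is a required argument with no default.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    TIER2 = "immediate Tier 2 escalation"
    RFH = "RFH process"


@dataclass
class Situation:
    """Facts about one Support emergency. All field names are hypothetical."""
    # Immediate-escalation criteria
    support_exhausted: bool = False
    cannot_stabilize_production: bool = False
    has_recovery_deadline: bool = False
    rollback_blocked: bool = False
    is_s1: bool = False
    minutes_unresolved: int = 0
    # RFH criteria
    customer_agrees_to_delay: bool = False
    can_wait_for_business_hours: bool = False
    temporary_fix_applied: bool = False


def choose_route(s: Situation, s1_threshold_minutes: int) -> Route:
    """Pick a route. The threshold is the still-TBD 'X minutes',
    so the caller must supply it; no default is assumed here."""
    rfh_ok = (
        s.customer_agrees_to_delay
        or s.can_wait_for_business_hours
        or s.temporary_fix_applied
    )
    s1_overdue = s.is_s1 and s.minutes_unresolved > s1_threshold_minutes
    needs_tier2 = (
        s.support_exhausted
        or s.cannot_stabilize_production
        or s.has_recovery_deadline
        or s.rollback_blocked
        or s1_overdue
    )
    # Assumption: an RFH condition wins over an immediate criterion,
    # and anything matching neither list defaults to RFH.
    if needs_tier2 and not rfh_ok:
        return Route.TIER2
    return Route.RFH
```

Writing the criteria this way exposes the open questions: how the criteria combine, which list wins on overlap, and what the S1 threshold should be. These would need to be settled with the Support team before anything like this is relied on.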
Implementation Requirements
Process Documentation
- Update handbook with clear escalation criteria
- Document current Support emergency process for Tier 2 visibility
- Include customer events calendar information sharing
- Add training materials for customer call engagement
Tooling & Access
- Ensure Support team has proper incident.io access and roles
- Set up proper PagerDuty integration for Tier 2 notifications
- Include Tier 2 in weekend customer event notifications
Monitoring & Improvement
- Implement retro issue for every emergency requiring Dev assistance
- Track escalation patterns to identify improvement opportunities
- Monitor weekend vs weekday escalation ratios
- Create feedback loop for better documentation/error messages
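As a rough illustration of the weekend vs weekday monitoring item, the sketch below computes the share of escalations that started on a Saturday or Sunday. It assumes escalation start times are available as Python `datetime` values (for example exported from the retro issues) and that each timestamp is already in the time zone in which "weekend" should be judged. The function names are hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Optional, Tuple


def weekend_weekday_counts(timestamps: Iterable[datetime]) -> Tuple[int, int]:
    """Return (weekend_count, weekday_count) for the given start times."""
    # datetime.weekday() returns 0 for Monday through 6 for Sunday,
    # so 5 and 6 are Saturday and Sunday.
    counts = Counter(
        "weekend" if ts.weekday() >= 5 else "weekday" for ts in timestamps
    )
    return counts["weekend"], counts["weekday"]


def weekend_share(timestamps: Iterable[datetime]) -> Optional[float]:
    """Fraction of escalations that started on a weekend, or None if there are none."""
    weekend, weekday = weekend_weekday_counts(timestamps)
    total = weekend + weekday
    return weekend / total if total else None
```

Tracking this share over time would show whether weekend load grows as more Tier 1 teams are added.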
Guardrails
- Maintain Support Manager on Call approval for escalations where possible
- Define clear handoff procedures between Support and Engineering
- Establish customer communication protocols during escalations
Success Metrics
- Fewer unpredictable escalation experiences
- Maintained customer satisfaction during emergencies
- Sustainable weekend coverage without burnout
- Clear documentation preventing repeat escalations for same issues
Next Steps
- Finalize specific escalation criteria with Support team
- Update handbook documentation
- Set up technical integrations (incident.io, PagerDuty)
- Establish retro process for continuous improvement
- Monitor initial rollout and adjust criteria as needed
Related Links
Edited by Darva Satcher