Update GitLab Dedicated support documentation with comprehensive tenant status information
Summary
This MR adds tenant status information to the GitLab Dedicated support documentation so that support engineers can interpret instance status and assist customers with status-related issues.
Background
This documentation update supports the Switchboard tenant status feature implementation, which exposes two levels of information:
- Customer view: High-level status indicators (Normal, Degraded Performance, Service Disruption) and maintenance notices
- Internal GitLab view: Additional detailed incident information from incident.io for support engineers and SREs
This enhanced internal visibility allows support teams to see active and recent incidents with full details from incident.io, while customers see only the appropriate high-level status information.
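The split between the two views can be sketched as follows. This is only an illustration of the visibility rule described above, not Switchboard's actual implementation: `TenantStatusRecord`, `render_view`, and the field names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TenantStatusRecord:
    # Hypothetical record; field names are assumptions for illustration.
    status: str                                    # e.g. "Normal"
    maintenance_notice: Optional[str] = None       # e.g. "Scheduled Maintenance"
    incidents: list = field(default_factory=list)  # incident.io details


def render_view(record, internal):
    """Return the fields visible to a viewer.

    Customers get the high-level status and any maintenance notice.
    Internal viewers additionally get the incident details.
    """
    view = {
        "status": record.status,
        "maintenance_notice": record.maintenance_notice,
    }
    if internal:
        view["incidents"] = list(record.incidents)
    return view
```

Calling `render_view(record, internal=False)` returns only the status and maintenance notice, while `internal=True` also includes the incident list.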
Related work: Show active and recent incidents for a tenant in Switchboard
Changes
Updates three key support workflow pages:
1. /handbook/support/workflows/dedicated/
- Adds detailed explanations of Switchboard status indicators
- Provides context for Normal, Degraded Performance, and Service Disruption states
- Includes maintenance indicators and important notes for support engineers
2. /handbook/support/workflows/dedicated_instance_health/
- Enhances Grafana monitoring guidance with status correlation information
- Adds context for interpreting metrics in relation to tenant status
3. /handbook/support/workflows/dedicated_switchboard/
- Expands Switchboard troubleshooting with comprehensive status information
- Provides detailed guidance on interpreting tenant status indicators
Content Details
The updates include:
Status Indicators
- Normal: No active S1/S2 incidents, instance operating as expected
- Degraded Performance: Active S2 incidents affecting core functionality
- Service Disruption: Active S1 incidents with services fully down
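As a rough sketch, the mapping from active incident severities to a status indicator could look like the following. The function and enum names are hypothetical, and the precedence of S1 over S2 when both are active is an assumption rather than something this MR states.

```python
from enum import Enum


class TenantStatus(Enum):
    NORMAL = "Normal"
    DEGRADED_PERFORMANCE = "Degraded Performance"
    SERVICE_DISRUPTION = "Service Disruption"


def tenant_status(active_severities):
    """Derive a status from the severities of a tenant's active incidents.

    `active_severities` is a hypothetical iterable of strings such as
    "S1" or "S3". S3/S4 incidents do not change the indicator.
    """
    severities = set(active_severities)
    # Assumption: an active S1 outranks any active S2.
    if "S1" in severities:
        return TenantStatus.SERVICE_DISRUPTION
    if "S2" in severities:
        return TenantStatus.DEGRADED_PERFORMANCE
    return TenantStatus.NORMAL
```

For example, a tenant with only S3 and S4 incidents active would still show Normal under this sketch.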
Maintenance Indicators
- Scheduled Maintenance: During planned maintenance windows
- Emergency Maintenance: During critical, time-sensitive updates
Support Engineer Guidance
- Which incidents aren't displayed (S3/S4 incidents and those in non-impacting lifecycle stages)
- Important notes about SLA calculations and sync timing
- How to correlate status with customer reports and troubleshooting steps
- Understanding the enhanced incident visibility available to internal teams
Benefits
These updates give GitLab Support engineers:
- Clear understanding of what each status means for both customer and internal views
- Better context for customer communications based on detailed incident information
- Improved troubleshooting guidance based on current instance status
- Consistent information across all Dedicated support documentation
- Knowledge of the enhanced incident details available through the internal Switchboard interface
Edited by Loryn Bortins