Phase 1 - Incident Routing for Cells On-Call
## Summary
This is the second phase of the [project to define an on-call process for Cells](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/1787). The Incident Routing phase will define how alerts are routed to pagers, and how we create incidents from them.
## DRI
@devin
## Objectives
1. Alertmanager integration with incident.io for automated incident creation: https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/issues/28061
2. Cells team Tier 2 escalation: https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28005
## Deliverables
### Tier 2 Rotation
- Best-effort Tier 2 rotation schedule
- Escalation path that the EOC can select
### Alertmanager Integration
- incident.io configured as an Alertmanager destination
- Metadata added to alerts so that incident.io can filter and route them
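
As a rough illustration of the metadata deliverable, the sketch below filters an Alertmanager webhook payload down to the firing alerts that carry enough labels to be routed. The label names (`team`, `severity`, `environment`), the value `cells`, and the alert names are assumptions made for the example, not a decided scheme. Only the payload shape (`alerts`, `status`, `labels`) follows Alertmanager's webhook format.

```python
from typing import Any, Mapping

# Hypothetical routing labels; the actual metadata scheme is not yet decided.
ROUTING_LABELS = ("team", "severity", "environment")


def missing_routing_labels(labels: Mapping[str, str]) -> list[str]:
    """Return the routing labels that are absent or empty on an alert."""
    return [name for name in ROUTING_LABELS if not labels.get(name)]


def routable_alerts(payload: Mapping[str, Any], team: str = "cells") -> list[Mapping[str, Any]]:
    """Return firing alerts for `team` that carry every routing label."""
    selected = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        if alert.get("status") != "firing":
            continue  # resolved alerts should not open new incidents
        if labels.get("team") != team:
            continue  # belongs to another team's routing
        if missing_routing_labels(labels):
            continue  # not enough metadata to filter or route on
        selected.append(alert)
    return selected


# Made-up payload in the shape Alertmanager sends to webhook receivers.
example_payload = {
    "version": "4",
    "status": "firing",
    "alerts": [
        {"status": "firing",
         "labels": {"alertname": "CellDown", "team": "cells",
                    "severity": "s1", "environment": "prod"}},
        {"status": "firing",
         "labels": {"alertname": "NoMetadata", "team": "cells"}},
        {"status": "resolved",
         "labels": {"alertname": "CellRecovered", "team": "cells",
                    "severity": "s2", "environment": "prod"}},
    ],
}

routable_names = [a["labels"]["alertname"] for a in routable_alerts(example_payload)]
```

A check like `missing_routing_labels` could also run against alert rules before they ship, so that alerts without routing metadata are caught early instead of being dropped at routing time.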
## Key Questions to Answer
- Will we set up Alertmanager routing just for Cells, or standardize it across all of the Dedicated tooling?
## Exit Criteria
- [ ] Tier 2 - Cells Rotation Schedule
- [ ] Cells Alertmanager can send alerts to incident.io
## Timeline
Target: 2-3 weeks of working time to allow adequate implementation time before Protocells launch. The due date takes into account the PTO scheduled around the holidays.
## Related Links
- [Parent Epic: Establish On-Call Process for GitLab Cells (#1787)](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/1787)
- [Protocells Epic (#1616)](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/1616)
- [Cells Architecture Design](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/cells/)
## Issue Admin
```
/labels ~"group::Networking & Incident Management" ~"workflow-infra::Triage"
```
<!-- STATUS NOTE START -->
## Status 2026-02-19
:clock1: **total hours spent this week by all contributors**: 4
:tada: **achievements**:
- The Cells team is set up as a best-effort Tier 2 on-call rotation, so we can page them when we're ready
:warning: **change in plan**
- The Cells team is now targeting [Q3 for Protocells](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/1787#note_3099229073) in production. This will push creation of the documentation into Q2, and temporarily reduce their focus on providing what Incident Management needs to proceed.
:issue-blocked: **blockers**:
- Waiting for Cells and Observability teams to [finalize sending Cells alerts to incident.io](https://gitlab.com/gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team/-/issues/616#note_3003480408) so we can take action on them
- Also waiting on the creation of a [LevelUP Training for the EOCs](https://gitlab.com/gitlab-com/gl-infra/tenant-scale/tenant-services/team/-/work_items/357)
_Copied from https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/1789#note_3095952419_
<!-- STATUS NOTE END -->