How could we improve handling of multiple, simultaneous incidents?
Why
When we have multiple incidents, it's difficult to track them all. A single Zoom call can be helpful when the individuals on call are handling all of them and context switching (which has its own pros and cons). It's also an easy way for people to join when an incident is going on. Unfortunately, when we have multiple efforts going on, people often join "the incident room" only to have to gather context, realise that a different incident is being focussed on, and then leave to work out what's going on with the incident they're tracking.
What
When an incident is created, we already generate an issue and a Slack channel that are unique to it. These are shared in the incident management Slack channel by the woodhouse bot. Consider generating a unique incident-call-url per incident, or another strategy to focus the discussion.
Proposals
Generally, I'm thinking about running a separate incident response for each incident when we have simultaneous S1 or S2 incidents.
- We could create several incident Zoom rooms and cycle through a static list for Severity 1 and Severity 2 incidents. We could update the template for sharing incident details in the various places we share that information. We could let the team know that they should check S1 & S2 incidents for a specific Zoom room.
- We could generate new incident Zoom rooms for each new S1 or S2 incident if one is already active and keep the existing incident room as the default.
- We could continue with the current incident call and create breakout rooms for simultaneous S1/S2 incidents. We'd need to check the functionality as people would need to come and go from breakout rooms and the main room without intervention from a host/co-host.
- [your suggestion]
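
To make the first two proposals more concrete, here is a minimal sketch of how room assignment might work. It is only an illustration: `RoomAllocator`, the room names, and the fallback behaviour when every room is busy are all assumptions, not part of any existing tooling. The first active S1/S2 incident keeps the default room, later simultaneous S1/S2 incidents take the next free room from a static list, and lower-severity incidents are pointed at the default room without reserving it.

```python
from dataclasses import dataclass, field


@dataclass
class RoomAllocator:
    """Hypothetical helper that maps active S1/S2 incidents to call rooms."""

    default_room: str                 # the existing incident room
    overflow_rooms: list              # static list of extra rooms (assumed)
    active: dict = field(default_factory=dict)  # incident id -> room

    def assign(self, incident_id: str, severity: int) -> str:
        # Only S1/S2 incidents reserve a room; anything else is pointed
        # at the default room without reserving it (an assumption).
        if severity not in (1, 2):
            return self.default_room
        # Re-announcing an incident returns the room it already holds.
        if incident_id in self.active:
            return self.active[incident_id]
        in_use = set(self.active.values())
        candidates = [self.default_room, *self.overflow_rooms]
        # Take the first free room; if every room is busy, fall back to
        # the default room (also an assumption).
        room = next((r for r in candidates if r not in in_use), self.default_room)
        self.active[incident_id] = room
        return room

    def resolve(self, incident_id: str) -> None:
        # Free the room when the incident is resolved.
        self.active.pop(incident_id, None)


alloc = RoomAllocator("incident-room", ["incident-room-2", "incident-room-3"])
first = alloc.assign("incident-a", severity=1)   # takes the default room
second = alloc.assign("incident-b", severity=2)  # takes the next free room
```

The assigned room could then be included in the details shared for each incident, so people joining know which call to enter.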
Drawbacks
- Having multiple Zoom rooms when one individual is on call for their role means that person would need to switch between rooms. This would likely add complexity to incident response.
- We would need to consider paging backup individuals in key roles (Communication Manager, Engineer on Call, Incident Manager, others).