Checklist for CustomersDot outages
The current CustomersDot documentation is missing a crucial section: steps or pointers on what to do during a CustomersDot outage. Without that section, every incident workflow is more complicated because a Fulfillment engineer has to be pulled in.
We should add a checklist that any team member can use to mitigate a CustomersDot incident without involving a Fulfillment engineer. Ideally, in addition to this checklist, we should add a brief explanation of the following:
- How we deploy CustomersDot: two projects (the Ansible repo and the main repo), with two pipelines for provisioning and deploying.
- CustomersDot infrastructure: where everything is located, how we monitor services, what alerts we have, where to search for logs, etc.