Status page lists recent incidents
Problem to solve
A public status page is an efficient way to communicate with stakeholders about incidents and outages that affect the services they depend on. These stakeholders may be the company's executive leadership or its customers. When multiple incidents occur at the same time, a list of ongoing or recently closed incidents/outages gives stakeholders a 30,000 ft view of the state of services, with the option to drill down into particular problems.
Intended users
The status page has two distinct sets of intended users. The first group is response engineers who use GitLab to manage incidents and want to provide incident status.
- Delaney (Development Team Lead)
- Sasha (Software Developer)
- Devon (DevOps Engineer)
- Sidney (Systems Administrator)
The second group is the vast array of stakeholders who want to consume the incident status. This can include everyone from business executives to customers.
Further details
This work supports the Status Page vision.
Proposal
- When an incident is created, GitLab automatically adds that incident to a list on the main Status page.
- For each incident, the page lists the date, the title, and the status (open/closed).
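The list described above can be sketched as a small data model plus a rendering step. This is only an illustration under stated assumptions, not the planned implementation: the `Incident` class, its field names, the `render_incident_list` function, and the newest-first ordering are all hypothetical and do not come from this issue or from any existing GitLab API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Incident:
    # Hypothetical fields mirroring the three items the proposal lists.
    title: str
    created_on: date
    is_open: bool


def render_incident_list(incidents):
    """Return one text line per incident: date, title, and open/closed status.

    Sorting newest first is an assumption; the proposal does not specify an order.
    """
    ordered = sorted(incidents, key=lambda i: i.created_on, reverse=True)
    return [
        f"{i.created_on.isoformat()}  {i.title}  ({'open' if i.is_open else 'closed'})"
        for i in ordered
    ]
```

Each returned line corresponds to one row of the incident list; a real page would render the same three fields as markup rather than plain text.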
Design
The MVC design for the incident list page is as follows:
Note that this design also includes a button linking the incident summary to the detail page. We won't have anything for these buttons to link to until #205165 (closed) is completed. Depending on how development goes, they can be removed from this issue and added to #205579 (closed) if needed.
Out of scope
- Full report buttons may or may not be included depending on the status of #205165 (closed)
- Subscribe to RSS buttons may or may not be included depending on the status of #205579 (closed)
Technical details
TBD
Permissions and Security
Documentation
Documentation is required. We will need to create a brand new section in the GitLab docs and call it Status Page.
Availability & Testing
A new end-to-end test will be introduced to cover the functionality of this feature. Please follow up in E2E test for Status Page with Incidents.
What does success look like, and how can we measure that?
Links / references
Full set of mocks for the entire workflow is on Sketch cloud
Full user flow is also visible on Mural