Alert on failed deployments
Release notes
Problem to solve
Developers and release managers want to know when deployments fail so that they can address the underlying issues that caused the failure. Deployments sometimes take a long time, and the responsible parties are not actively watching the pipelines. Alerts let users move on to other tasks with the assurance that they will be notified immediately if something needs to be addressed.
How are users notified of failed deployments and other pipeline events?
- Webhooks: many users configure a webhook that is triggered by deployment events.
- Email notifications: a failed job in a pipeline sends a notification by email.
- Jobs page: failed deployments are visible there, but this is a manual check.
- Slack notifications: the Slack notifications service can also send pipeline failures and deployment events to Slack.
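As an illustration of the webhook route, here is a minimal sketch of how a receiver might pick failed deployments out of incoming events. It assumes the deployment event payload carries `object_kind`, `status`, `environment`, and `deployable_url` fields; the function name `failed_deployment_message` is hypothetical, and the field names should be checked against the webhook documentation before relying on them.

```python
from typing import Any, Mapping, Optional


def failed_deployment_message(event: Mapping[str, Any]) -> Optional[str]:
    """Return a notification line for a failed deployment event, else None.

    Assumed payload fields (verify against the webhook docs):
    "object_kind", "status", "environment", "deployable_url".
    """
    # Ignore pipeline, job, and other non-deployment events.
    if event.get("object_kind") != "deployment":
        return None
    # Only failures are interesting here; running/success are skipped.
    if event.get("status") != "failed":
        return None
    environment = event.get("environment", "unknown environment")
    job_url = event.get("deployable_url", "no job URL provided")
    return f"Deployment to {environment} failed: {job_url}"
```

A receiver would call this on each decoded webhook body and forward any non-`None` result to whatever channel the team uses. This is the kind of glue code users maintain today and that a built-in alert would make unnecessary.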
Proposal
Enable alerting on failed pipeline jobs.
When pipeline jobs fail today, they create to-dos. We will need to be careful not to create duplicate to-dos when we add alerting.
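One possible way to reason about the duplication risk is to key every failure on a stable identifier, so that the to-do path and the alert path cannot both surface the same failure twice. The sketch below is purely illustrative and is not how GitLab implements to-dos or alerts: the `FailureNotifier` class and the choice of `(project_id, job_id)` as the key are assumptions.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class FailureNotifier:
    """Hypothetical helper that remembers which job failures were surfaced."""

    _seen: Set[Tuple[int, int]] = field(default_factory=set)

    def should_notify(self, project_id: int, job_id: int) -> bool:
        """Return True only the first time a given failed job is reported."""
        key = (project_id, job_id)
        if key in self._seen:
            # Already surfaced (as a to-do or an alert); skip the duplicate.
            return False
        self._seen.add(key)
        return True
```

With a shared key like this, whichever mechanism sees the failure first creates the notification, and the second one becomes a no-op.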
Intended users
- Devon (DevOps Engineer)
- Rachel (Release Manager)
- Allison (Application Ops)
- Priyanka (Platform Engineer)
User experience goal
Give people a place to opt in to receiving alerts for failed deployments.
Proposal
Allow users to generate alerts in GitLab for failed deployments. Surface the alert in the alerts list.
Design
People can opt in to this feature on the Settings > Repository page:
Following the existing pattern on this page, the question mark can link to the docs for this feature.
We can also consider promoting this new feature on the alert/alert settings page, to clarify that this setting needs to be enabled elsewhere:
(Mockups: on operations settings; on alert list)
Note: If we add these tip alerts, we should verify the text with Technical Writing (TW).
Further Details
This work supports the Incident Management direction.