PoC: Introduce a processor to create incidents from failing pipelines

What does this MR do and why?

This MR currently does three things:

  1. Support pipeline events: !2116 (merged)
  2. Introduce classes to create incidents from pipeline events
  3. Introduce a processor to create incidents from failing pipelines
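As a rough illustration of items 2 and 3, the sketch below turns a pipeline event into an incident when a pipeline fails on the tracked branch. It is not the code in this MR. It is written in Python for brevity, the `Incident` class, the `incident_from_pipeline_event` function, and the `tracked_ref` parameter are hypothetical names, and the payload fields (`object_kind`, `object_attributes`, `project`) are assumed to follow the shape of a GitLab pipeline webhook event.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    """Hypothetical incident record; a real one would be created through the GitLab API."""
    title: str
    description: str


def incident_from_pipeline_event(event: dict, tracked_ref: str = "master") -> Optional[Incident]:
    """Return an Incident for a failed pipeline on tracked_ref, otherwise None."""
    # Ignore anything that is not a pipeline event.
    if event.get("object_kind") != "pipeline":
        return None

    attrs = event.get("object_attributes", {})
    # Only failed pipelines on the tracked branch are of interest.
    if attrs.get("status") != "failed" or attrs.get("ref") != tracked_ref:
        return None

    project = event.get("project", {})
    # Assumed URL layout: <project web_url>/-/pipelines/<pipeline id>.
    pipeline_url = f"{project.get('web_url', '')}/-/pipelines/{attrs.get('id')}"
    return Incident(
        title=f"Broken {tracked_ref}: pipeline {attrs.get('id')} failed",
        description=f"Failing pipeline: {pipeline_url}",
    )
```

Keeping the event-to-incident step as a pure function leaves the side effects (creating the issue, notifying Slack) to the caller, which should make the processor easier to exercise in a dry run.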

The goal is to replace the jobs and scripts in the main project: "broken master" management shouldn't be the main project's responsibility, since it is an Ops concern.

This effectively replaces:

Still to do

  • Move the first two commits to separate MRs.
  • Handle the Slack notification after creating the incident
  • Handle the ruby2 branch use case
  • Allow posting to Slack without creating an incident (currently done for stable and ruby2 branches)

Expected impact & dry-runs

These are strongly recommended to assist reviewers and reduce the time to merge your change.

See https://gitlab.com/gitlab-org/quality/triage-ops/-/tree/master/doc/scheduled#testing-with-a-dry-run on how to perform dry-runs for new policies.

See https://gitlab.com/gitlab-org/quality/triage-ops/-/blob/master/doc/reactive/best_practices.md#use-the-sandbox-to-test-new-processors on how to make sure a new processor can be tested.

Action items

Edited by Rémy Coutable
