Add Alert Escalation model & escalation service

Sean Arnold requested to merge 323139-create-escalations-table into master

What does this MR do?

DB Migration

This adds the incident_management_alert_escalations table, which backs the AlertEscalation model, as part of #323139 (closed).

incident_management_alert_escalations

Column              Type                        Null
id                  bigint                      not null
policy_id           bigint                      not null
alert_id            bigint                      not null
last_notified_at    timestamp with time zone    not null
created_at          timestamp with time zone    not null
updated_at          timestamp with time zone    not null

Database commands:

Up
rake db:migrate:up VERSION=20210524042404
== 20210524042404 CreateIncidentManagementEscalations: migrating ==============
-- create_table(:incident_management_alert_escalations)
   -> 0.0186s
== 20210524042404 CreateIncidentManagementEscalations: migrated (0.0186s) =====

(10.5ms)  CREATE TABLE "incident_management_alert_escalations" ("id" bigserial primary key, "policy_id" bigint NOT NULL, "alert_id" bigint NOT NULL, "created_at" timestamptz NOT NULL, "updated_at" timestamptz NOT NULL, CONSTRAINT "fk_rails_bc0826ee7d"
FOREIGN KEY ("policy_id")
  REFERENCES "incident_management_escalation_policies" ("id")
 ON DELETE CASCADE, CONSTRAINT "fk_rails_8d8de95da9"
FOREIGN KEY ("alert_id")
  REFERENCES "alert_management_alerts" ("id")
 ON DELETE CASCADE) /*application:web,line:/db/migrate/20210524042404_create_incident_management_escalations.rb:5:in `change'*/
  ↳ lib/gitlab/database.rb:377:in `block in transaction'
   (1.3ms)  CREATE  INDEX  "index_incident_management_alert_escalations_on_policy_id" ON "incident_management_alert_escalations"  ("policy_id") /*application:web,line:/db/migrate/20210524042404_create_incident_management_escalations.rb:5:in `change'*/
  ↳ lib/gitlab/database.rb:377:in `block in transaction'
   (1.0ms)  CREATE  INDEX  "index_incident_management_alert_escalations_on_alert_id" ON "incident_management_alert_escalations"  ("alert_id") /*application:web,line:/db/migrate/20210524042404_create_incident_management_escalations.rb:5:in `change'*/
  ↳ lib/gitlab/database.rb:377:in `block in transaction'
  primary::SchemaMigration Create (1.0ms)  INSERT INTO "schema_migrations" ("version") VALUES ('20210524042404') RETURNING "version" /*application:web,line:/lib/gitlab/database.rb:377:in `block in transaction'*/
Down
rake db:migrate:down VERSION=20210524042404
== 20210524042404 CreateIncidentManagementEscalations: reverting ==============
-- drop_table(:incident_management_alert_escalations)
   -> 0.0070s
== 20210524042404 CreateIncidentManagementEscalations: reverted (0.0109s) =====

IncidentManagement::Escalations::ProcessService

This MR also adds logic that creates an AlertEscalation when an alert comes in and is processed. Additionally, it introduces a service that escalates the alert to the specified on-call schedule if the alert escalation has not met certain thresholds. The service will be used in a later MR, which introduces background jobs that check the state of the escalations every minute. See #323139 (closed) for more information.
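The notify-once-per-rule behaviour described here can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not the service added by this MR: the Rule, Alert and Escalation structs, the elapsed_minutes field, the ProcessServiceSketch class and its notifier argument are all hypothetical stand-ins, and the real ProcessService may decide differently.

```ruby
# Hypothetical, simplified stand-ins; these are NOT the classes added by this MR.
Rule       = Struct.new(:elapsed_minutes, :schedule) # notify `schedule` once the alert is this old
Alert      = Struct.new(:status)                     # e.g. :triggered, :acknowledged, :resolved
Escalation = Struct.new(:alert, :rules, :created_at, :last_notified_at)

class ProcessServiceSketch
  def initialize(escalation, now: Time.now, notifier: ->(schedule) { puts "notify #{schedule}" })
    @escalation = escalation
    @now = now
    @notifier = notifier
  end

  # Notifies every rule whose threshold has passed since the last notification
  # and returns the schedules that were notified during this run.
  def execute
    # Assumption: only alerts that are still triggered get escalated.
    return [] unless @escalation.alert.status == :triggered

    due = @escalation.rules.select { |rule| due?(rule) }
    return [] if due.empty?

    due.each { |rule| @notifier.call(rule.schedule) }
    @escalation.last_notified_at = @now
    due.map(&:schedule)
  end

  private

  # A rule is due when its threshold time has been reached and no
  # notification has been sent at or after that threshold.
  def due?(rule)
    threshold = @escalation.created_at + rule.elapsed_minutes * 60
    return false if @now < threshold

    last = @escalation.last_notified_at
    last.nil? || last < threshold
  end
end

# Usage: two rules, one immediate and one after 10 minutes.
t0 = Time.at(0)
escalation = Escalation.new(
  Alert.new(:triggered),
  [Rule.new(0, "primary"), Rule.new(10, "secondary")],
  t0,
  nil
)
ProcessServiceSketch.new(escalation, now: t0).execute       # notifies "primary"
ProcessServiceSketch.new(escalation, now: t0 + 300).execute # between rules: notifies nobody
ProcessServiceSketch.new(escalation, now: t0 + 600).execute # notifies "secondary"
```

In this sketch, last_notified_at is what makes repeated runs safe: re-running the service between two rule thresholds finds no rule that is newly due, so a job that runs every minute would not send duplicate notifications.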

Testing

How to test locally:

  1. Enable the feature flag with Feature.enable(:escalation_policies_mvc) and ensure your project is on a Premium or higher plan.
  2. Set up an on-call schedule by following https://docs.gitlab.com/ee/operations/incident_management/oncall_schedules.html#schedules
  3. Create an alert by following the instructions at https://docs.gitlab.com/ee/operations/incident_management/integrations.html#customize-the-alert-payload-outside-of-gitlab
  4. Manually run the escalation processing logic with ::IncidentManagement::Escalations::ProcessService.new(escalation).execute, where escalation is the AlertEscalation created for the alert. This sends an email to the user who is on call.
  5. Re-run ::IncidentManagement::Escalations::ProcessService.new(escalation).execute in between rules. It should not send another email until the next rule threshold is met.

Screenshots (strongly suggested)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Related to #323139 (closed)

Edited by Sean Arnold