Add Alert Escalation model & escalation service
What does this MR do?
DB Migration
This adds the AlertEscalation (incident_management_alert_escalations) table, as part of #323139 (closed).
incident_management_alert_escalations | Type | Null |
---|---|---|
id | bigint | not null |
policy_id | bigint | not null |
alert_id | bigint | not null |
last_notified_at | timestamp with time zone | not null |
created_at | timestamp with time zone | not null |
updated_at | timestamp with time zone | not null |
Database commands:
Up
rake db:migrate:up VERSION=20210524042404
== 20210524042404 CreateIncidentManagementEscalations: migrating ==============
-- create_table(:incident_management_alert_escalations)
-> 0.0186s
== 20210524042404 CreateIncidentManagementEscalations: migrated (0.0186s) =====
(10.5ms) CREATE TABLE "incident_management_alert_escalations" ("id" bigserial primary key, "policy_id" bigint NOT NULL, "alert_id" bigint NOT NULL, "created_at" timestamptz NOT NULL, "updated_at" timestamptz NOT NULL, CONSTRAINT "fk_rails_bc0826ee7d"
FOREIGN KEY ("policy_id")
REFERENCES "incident_management_escalation_policies" ("id")
ON DELETE CASCADE, CONSTRAINT "fk_rails_8d8de95da9"
FOREIGN KEY ("alert_id")
REFERENCES "alert_management_alerts" ("id")
ON DELETE CASCADE) /*application:web,line:/db/migrate/20210524042404_create_incident_management_escalations.rb:5:in `change'*/
↳ lib/gitlab/database.rb:377:in `block in transaction'
(1.3ms) CREATE INDEX "index_incident_management_alert_escalations_on_policy_id" ON "incident_management_alert_escalations" ("policy_id") /*application:web,line:/db/migrate/20210524042404_create_incident_management_escalations.rb:5:in `change'*/
↳ lib/gitlab/database.rb:377:in `block in transaction'
(1.0ms) CREATE INDEX "index_incident_management_alert_escalations_on_alert_id" ON "incident_management_alert_escalations" ("alert_id") /*application:web,line:/db/migrate/20210524042404_create_incident_management_escalations.rb:5:in `change'*/
↳ lib/gitlab/database.rb:377:in `block in transaction'
primary::SchemaMigration Create (1.0ms) INSERT INTO "schema_migrations" ("version") VALUES ('20210524042404') RETURNING "version" /*application:web,line:/lib/gitlab/database.rb:377:in `block in transaction'*/
Down
rake db:migrate:down VERSION=20210524042404
== 20210524042404 CreateIncidentManagementEscalations: reverting ==============
-- drop_table(:incident_management_alert_escalations)
-> 0.0070s
== 20210524042404 CreateIncidentManagementEscalations: reverted (0.0109s) =====
IncidentManagement::Escalations::ProcessService
It also adds logic for creating AlertEscalations when an alert comes in and is processed.
Additionally, it introduces a service that escalates the alert to the specified on-call schedule if the alert escalation has not met certain thresholds. A later MR will use this service when we introduce background jobs that check the state of the escalations every minute. See #323139 (closed) for more info.
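As a rough illustration of the threshold idea (not the code in this MR), the check could look something like the plain-Ruby sketch below. `Rule`, `Escalation`, `elapsed_seconds`, `rules_due`, `process`, and the `outbox` array are all hypothetical stand-ins, and the comparison against `last_notified_at` is an assumption about how repeat notifications might be avoided.

```ruby
# Hypothetical stand-ins for the real ActiveRecord models; names are illustrative only.
Rule = Struct.new(:elapsed_seconds, :schedule, keyword_init: true)
Escalation = Struct.new(:created_at, :last_notified_at, :rules, keyword_init: true)

# Returns the rules whose threshold was crossed since the last notification.
# Assumption: a rule is due when its threshold lies in the half-open window
# [time already covered, time elapsed now).
def rules_due(escalation, now)
  notified_up_to = escalation.last_notified_at - escalation.created_at
  elapsed = now - escalation.created_at
  escalation.rules.select do |rule|
    (notified_up_to...elapsed).cover?(rule.elapsed_seconds)
  end
end

# Notifies each due rule's schedule once, then records the notification time.
# `outbox` is a plain array standing in for email delivery.
def process(escalation, now, outbox)
  due = rules_due(escalation, now)
  return due if due.empty?

  due.each { |rule| outbox << "notify #{rule.schedule}" }
  escalation.last_notified_at = now
  due
end
```

Under these assumptions, calling `process` twice between two rule thresholds adds nothing to `outbox` on the second call, which matches the behaviour described in the testing steps.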
Testing
How to test locally:
- Enable the feature flag Feature.enable(:escalation_policies_mvc) & ensure your project is on Premium or above plan.
- Set up an on-call schedule by following https://docs.gitlab.com/ee/operations/incident_management/oncall_schedules.html#schedules
- Create an alert by following the instructions at https://docs.gitlab.com/ee/operations/incident_management/integrations.html#customize-the-alert-payload-outside-of-gitlab
- Manually run the escalation processing logic ::IncidentManagement::Escalations::ProcessService.new(escalation).execute. This will send an email to the user who is on call.
- You can re-run ::IncidentManagement::Escalations::ProcessService.new(escalation).execute in between rules and it should not send another email until the rule threshold is met.
Does this MR meet the acceptance criteria?
Conformity
- I have included a changelog entry, or it's not needed. (Does this MR need a changelog?)
- I have added/updated documentation, or it's not needed. (Is documentation required?)
- I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?)
- I have added information for database reviewers in the MR description, or it's not needed. (Does this MR have database related changes?)
- I have self-reviewed this MR per code review guidelines.
- This MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines)
- I have followed the style guides.
Availability and Testing
- I have added/updated tests following the Testing Guide, or it's not needed. (Consider all test levels. See the Test Planning Process.)
- I have tested this MR in all supported browsers, or it's not needed.
- I have informed the Infrastructure department of a default or new setting change per definition of done, or it's not needed.
Security
Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.
- Label as security and @ mention @gitlab-com/gl-security/appsec
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team
Related to #323139 (closed)