
Allow saving of runbooks to PrometheusAlert

Sarah Yasonik requested to merge sy-alert-runbook-ui-be into master

What does this MR do?

Adds a runbook_url field to PrometheusAlert. This lets users include a link to a runbook in the payload of a Prometheus alerting rule, so that a firing alert comes with instructions on how to handle the problem.
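
As a rough illustration of the idea, the sketch below builds a hash in the general shape of a Prometheus alerting rule and attaches the runbook link as an annotation. It is not the code in this MR: `AlertSketch`, `alerting_rule`, the `runbook` annotation key, and the example URL are all hypothetical names chosen for the example; only `runbook_url` comes from the change itself.

```ruby
# Hypothetical stand-in for the PrometheusAlert record. Only runbook_url
# comes from this MR; the other fields are illustrative.
AlertSketch = Struct.new(:title, :query, :runbook_url, keyword_init: true)

# Build a hash shaped like a Prometheus alerting rule.
# The 'runbook' annotation key is an assumption made for this sketch.
def alerting_rule(alert)
  annotations = {}
  annotations['runbook'] = alert.runbook_url if alert.runbook_url

  {
    'alert' => alert.title,
    'expr' => alert.query,
    'annotations' => annotations
  }
end

rule = alerting_rule(
  AlertSketch.new(
    title: 'HighErrorRate',
    query: 'rate(http_errors_total[5m]) > 10',
    runbook_url: 'https://example.com/runbooks/high-error-rate'
  )
)
puts rule['annotations']['runbook']
```

An alert without a runbook simply gets an empty annotations hash, which matches the column being nullable.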

Related issue: #208490 (closed)

Supplemental info for database review

Up migrations

$ rails db:migrate
== 20200729191227 AddRunbookToPrometheusAlert: migrating ======================
-- add_column(:prometheus_alerts, :runbook_url, :text)
   -> 0.0013s
== 20200729191227 AddRunbookToPrometheusAlert: migrated (0.0013s) =============

== 20200729200808 AddTextLimitToRunbookOnPrometheusAlerts: migrating ==========
-- transaction_open?()
   -> 0.0000s
-- execute("ALTER TABLE prometheus_alerts\nADD CONSTRAINT check_cb76d7e629\nCHECK ( char_length(runbook_url) <= 255 )\nNOT VALID;\n")
   -> 0.0009s
-- execute("ALTER TABLE prometheus_alerts VALIDATE CONSTRAINT check_cb76d7e629;")
   -> 0.0018s
== 20200729200808 AddTextLimitToRunbookOnPrometheusAlerts: migrated (0.0142s) =
$ bin/rails dbconsole
psql (11.7)
Type "help" for help.

gitlabhq_development=# \d prometheus_alerts
                                            Table "public.prometheus_alerts"
        Column        |           Type           | Collation | Nullable |                    Default                    
----------------------+--------------------------+-----------+----------+-----------------------------------------------
 id                   | integer                  |           | not null | nextval('prometheus_alerts_id_seq'::regclass)
 created_at           | timestamp with time zone |           | not null | 
 updated_at           | timestamp with time zone |           | not null | 
 threshold            | double precision         |           | not null | 
 operator             | integer                  |           | not null | 
 environment_id       | integer                  |           | not null | 
 project_id           | integer                  |           | not null | 
 prometheus_metric_id | integer                  |           | not null | 
 runbook_url          | text                     |           |          | 
Indexes:
    "prometheus_alerts_pkey" PRIMARY KEY, btree (id)
    "index_prometheus_alerts_metric_environment" UNIQUE, btree (project_id, prometheus_metric_id, environment_id)
    "index_prometheus_alerts_on_environment_id" btree (environment_id)
    "index_prometheus_alerts_on_prometheus_metric_id" btree (prometheus_metric_id)
Check constraints:
    "check_cb76d7e629" CHECK (char_length(runbook_url) <= 255)
Foreign-key constraints:
    "fk_rails_6d9b283465" FOREIGN KEY (environment_id) REFERENCES environments(id) ON DELETE CASCADE
    "fk_rails_e6351447ec" FOREIGN KEY (prometheus_metric_id) REFERENCES prometheus_metrics(id) ON DELETE CASCADE
    "fk_rails_f0e8db86aa" FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE
Referenced by:
    TABLE "alert_management_alerts" CONSTRAINT "fk_51ab4b6089" FOREIGN KEY (prometheus_alert_id) REFERENCES prometheus_alerts(id) ON DELETE CASCADE
    TABLE "prometheus_alert_events" CONSTRAINT "fk_rails_106f901176" FOREIGN KEY (prometheus_alert_id) REFERENCES prometheus_alerts(id) ON DELETE CASCADE
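
The AddTextLimitToRunbookOnPrometheusAlerts output above adds the limit in two statements. The constraint is first created as NOT VALID, so existing rows are not checked when it is added, and a separate VALIDATE CONSTRAINT statement then checks them. The plain-Ruby sketch below only reproduces the two SQL strings from the log; `text_limit_statements` is a hypothetical helper written for this illustration, not the migration helper used in the MR.

```ruby
# Returns the two statements used to add a length limit to a text column.
# Step 1 adds the constraint as NOT VALID, so existing rows are not scanned
# when the constraint is created. Step 2 validates the existing rows.
def text_limit_statements(table, column, limit, constraint_name)
  add_constraint = <<~SQL
    ALTER TABLE #{table}
    ADD CONSTRAINT #{constraint_name}
    CHECK ( char_length(#{column}) <= #{limit} )
    NOT VALID;
  SQL

  validate = "ALTER TABLE #{table} VALIDATE CONSTRAINT #{constraint_name};"

  [add_constraint, validate]
end

statements = text_limit_statements(
  'prometheus_alerts', 'runbook_url', 255, 'check_cb76d7e629'
)
statements.each { |sql| puts sql }
```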

Down: AddTextLimitToRunbookOnPrometheusAlerts

$ rails db:migrate:down VERSION=20200729200808
== 20200729200808 AddTextLimitToRunbookOnPrometheusAlerts: reverting ==========
-- execute("ALTER TABLE prometheus_alerts\nDROP CONSTRAINT IF EXISTS check_cb76d7e629\n")
   -> 0.0008s
== 20200729200808 AddTextLimitToRunbookOnPrometheusAlerts: reverted (0.0053s) =
$ bin/rails dbconsole
psql (11.7)
Type "help" for help.

gitlabhq_development=# \d prometheus_alerts
                                            Table "public.prometheus_alerts"
        Column        |           Type           | Collation | Nullable |                    Default                    
----------------------+--------------------------+-----------+----------+-----------------------------------------------
 id                   | integer                  |           | not null | nextval('prometheus_alerts_id_seq'::regclass)
 created_at           | timestamp with time zone |           | not null | 
 updated_at           | timestamp with time zone |           | not null | 
 threshold            | double precision         |           | not null | 
 operator             | integer                  |           | not null | 
 environment_id       | integer                  |           | not null | 
 project_id           | integer                  |           | not null | 
 prometheus_metric_id | integer                  |           | not null | 
 runbook_url          | text                     |           |          | 
Indexes:
    "prometheus_alerts_pkey" PRIMARY KEY, btree (id)
    "index_prometheus_alerts_metric_environment" UNIQUE, btree (project_id, prometheus_metric_id, environment_id)
    "index_prometheus_alerts_on_environment_id" btree (environment_id)
    "index_prometheus_alerts_on_prometheus_metric_id" btree (prometheus_metric_id)
Foreign-key constraints:
    "fk_rails_6d9b283465" FOREIGN KEY (environment_id) REFERENCES environments(id) ON DELETE CASCADE
    "fk_rails_e6351447ec" FOREIGN KEY (prometheus_metric_id) REFERENCES prometheus_metrics(id) ON DELETE CASCADE
    "fk_rails_f0e8db86aa" FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE
Referenced by:
    TABLE "alert_management_alerts" CONSTRAINT "fk_51ab4b6089" FOREIGN KEY (prometheus_alert_id) REFERENCES prometheus_alerts(id) ON DELETE CASCADE
    TABLE "prometheus_alert_events" CONSTRAINT "fk_rails_106f901176" FOREIGN KEY (prometheus_alert_id) REFERENCES prometheus_alerts(id) ON DELETE CASCADE
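
After this down migration the check_cb76d7e629 constraint is gone, so the database no longer rejects runbook_url values longer than 255 characters. For reference, the rule the constraint enforced can be expressed in plain Ruby as below. This is a sketch only; `runbook_url_within_limit?` and `RUNBOOK_URL_LIMIT` are hypothetical names, and this MR description does not say whether the model carries an equivalent validation.

```ruby
RUNBOOK_URL_LIMIT = 255 # limit taken from the check constraint in the up migration

# Mirrors CHECK (char_length(runbook_url) <= 255).
# A NULL value passes a PostgreSQL CHECK constraint, so nil is accepted here.
def runbook_url_within_limit?(runbook_url)
  runbook_url.nil? || runbook_url.length <= RUNBOOK_URL_LIMIT
end

puts runbook_url_within_limit?('https://example.com/runbooks/high-error-rate')
```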

Down: AddRunbookToPrometheusAlert

$ rails db:migrate:down VERSION=20200729191227
== 20200729191227 AddRunbookToPrometheusAlert: reverting ======================
-- remove_column(:prometheus_alerts, :runbook_url)
   -> 0.0010s
== 20200729191227 AddRunbookToPrometheusAlert: reverted (0.0010s) =============
$ bin/rails dbconsole
psql (11.7)
Type "help" for help.

gitlabhq_development=# \d prometheus_alerts
                                            Table "public.prometheus_alerts"
        Column        |           Type           | Collation | Nullable |                    Default                    
----------------------+--------------------------+-----------+----------+-----------------------------------------------
 id                   | integer                  |           | not null | nextval('prometheus_alerts_id_seq'::regclass)
 created_at           | timestamp with time zone |           | not null | 
 updated_at           | timestamp with time zone |           | not null | 
 threshold            | double precision         |           | not null | 
 operator             | integer                  |           | not null | 
 environment_id       | integer                  |           | not null | 
 project_id           | integer                  |           | not null | 
 prometheus_metric_id | integer                  |           | not null | 
Indexes:
    "prometheus_alerts_pkey" PRIMARY KEY, btree (id)
    "index_prometheus_alerts_metric_environment" UNIQUE, btree (project_id, prometheus_metric_id, environment_id)
    "index_prometheus_alerts_on_environment_id" btree (environment_id)
    "index_prometheus_alerts_on_prometheus_metric_id" btree (prometheus_metric_id)
Foreign-key constraints:
    "fk_rails_6d9b283465" FOREIGN KEY (environment_id) REFERENCES environments(id) ON DELETE CASCADE
    "fk_rails_e6351447ec" FOREIGN KEY (prometheus_metric_id) REFERENCES prometheus_metrics(id) ON DELETE CASCADE
    "fk_rails_f0e8db86aa" FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE
Referenced by:
    TABLE "alert_management_alerts" CONSTRAINT "fk_51ab4b6089" FOREIGN KEY (prometheus_alert_id) REFERENCES prometheus_alerts(id) ON DELETE CASCADE
    TABLE "prometheus_alert_events" CONSTRAINT "fk_rails_106f901176" FOREIGN KEY (prometheus_alert_id) REFERENCES prometheus_alerts(id) ON DELETE CASCADE

Screenshots

Alerting rule set from dashboard      | Firing alert payload
Screen_Shot_2020-08-05_at_12.40.38_PM | Screen_Shot_2020-08-05_at_12.40.31_PM

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Sarah Yasonik
