Add Northstar metric Monitor: Health

Allison Browne requested to merge 213894-northstar-metric-monitor-health into master

What does this MR do?

Documents that incident_issues represents the number of issues created by the alert bot.

Adds a new metric, alert_bot_incident_issues, which is identical to incident_issues and represents the number of issues created by the alert bot.


Adds incident_labeled_issues to usage ping.

This is the northstar metric: a count of all issues being used as incidents, from all sources. It overlaps with the first metric, which counts only issues that have been auto-created.

See the EXPLAIN output below for the generated queries.

  -----
  
     (1.0ms)  SELECT MIN("issues"."id") FROM "issues" INNER JOIN "label_links" ON "label_links"."target_id" = "issues"."id" AND "label_links"."target_type" = 'Issue' INNER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "labels"."title" = 'incident' AND "labels"."color" = '#CC0033' AND "labels"."description" = 'Denotes a disruption to IT services and the associated issues require immediate attention'
     
     
  Time: 29.838 ms
  - planning: 1.141 ms
  - execution: 28.697 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 25047 (~195.70 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0
  
  
   Aggregate  (cost=201.91..201.92 rows=1 width=4) (actual time=28.633..28.633 rows=1 loops=1)
   Buffers: shared hit=25047
   ->  Nested Loop  (cost=1.56..201.90 rows=2 width=4) (actual time=0.080..27.967 rows=4460 loops=1)
         Buffers: shared hit=25047
         ->  Nested Loop  (cost=1.00..200.70 rows=2 width=4) (actual time=0.059..8.516 rows=4460 loops=1)
               Buffers: shared hit=4465
               ->  Index Scan using index_labels_on_title on public.labels  (cost=0.43..99.51 rows=1 width=4) (actual time=0.036..0.210 rows=32 loops=1)
                     Index Cond: ((labels.title)::text = 'incident'::text)
                     Filter: (((labels.color)::text = '#CC0033'::text) AND ((labels.description)::text = 'Denotes a disruption to IT services and the associated issues require immediate attention'::text))
                     Rows Removed by Filter: 67
                     Buffers: shared hit=102
               ->  Index Scan using index_label_links_on_label_id on public.label_links  (cost=0.56..100.28 rows=91 width=8) (actual time=0.011..0.238 rows=139 loops=32)
                     Index Cond: (label_links.label_id = labels.id)
                     Filter: ((label_links.target_type)::text = 'Issue'::text)
                     Rows Removed by Filter: 0
                     Buffers: shared hit=4363
         ->  Index Only Scan using issues_pkey on public.issues  (cost=0.56..0.59 rows=1 width=4) (actual time=0.004..0.004 rows=1 loops=4460)
               Index Cond: (issues.id = label_links.target_id)
               Heap Fetches: 0
               Buffers: shared hit=20582
     
     
  ----
  
   (1.0ms)  SELECT MAX("issues"."id") FROM "issues" INNER JOIN "label_links" ON "label_links"."target_id" = "issues"."id" AND "label_links"."target_type" = 'Issue' INNER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "labels"."title" = 'incident' AND "labels"."color" = '#CC0033' AND "labels"."description" = 'Denotes a disruption to IT services and the associated issues require immediate attention'
   
   
   Time: 28.626 ms
  - planning: 0.991 ms
  - execution: 27.635 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 25047 (~195.70 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0
  
  
   Aggregate  (cost=201.91..201.92 rows=1 width=4) (actual time=27.560..27.560 rows=1 loops=1)
   Buffers: shared hit=25047
   ->  Nested Loop  (cost=1.56..201.90 rows=2 width=4) (actual time=0.122..26.896 rows=4460 loops=1)
         Buffers: shared hit=25047
         ->  Nested Loop  (cost=1.00..200.70 rows=2 width=4) (actual time=0.111..8.388 rows=4460 loops=1)
               Buffers: shared hit=4465
               ->  Index Scan using index_labels_on_title on public.labels  (cost=0.43..99.51 rows=1 width=4) (actual time=0.096..0.458 rows=32 loops=1)
                     Index Cond: ((labels.title)::text = 'incident'::text)
                     Filter: (((labels.color)::text = '#CC0033'::text) AND ((labels.description)::text = 'Denotes a disruption to IT services and the associated issues require immediate attention'::text))
                     Rows Removed by Filter: 67
                     Buffers: shared hit=102
               ->  Index Scan using index_label_links_on_label_id on public.label_links  (cost=0.56..100.28 rows=91 width=8) (actual time=0.008..0.226 rows=139 loops=32)
                     Index Cond: (label_links.label_id = labels.id)
                     Filter: ((label_links.target_type)::text = 'Issue'::text)
                     Rows Removed by Filter: 0
                     Buffers: shared hit=4363
         ->  Index Only Scan using issues_pkey on public.issues  (cost=0.56..0.59 rows=1 width=4) (actual time=0.004..0.004 rows=1 loops=4460)
               Index Cond: (issues.id = label_links.target_id)
               Heap Fetches: 0
               Buffers: shared hit=20582

   
   ----
   
   (0.9ms)  SELECT COUNT("issues"."id") FROM "issues" INNER JOIN "label_links" ON "label_links"."target_id" = "issues"."id" AND "label_links"."target_type" = 'Issue' INNER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "labels"."title" = 'incident' AND "labels"."color" = '#CC0033' AND "labels"."description" = 'Denotes a disruption to IT services and the associated issues require immediate attention' AND "issues"."id" BETWEEN 0 AND 99999
   
   Time: 9.167 ms
  - planning: 1.003 ms
  - execution: 8.164 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 4465 (~34.90 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0
  
  
   Aggregate  (cost=201.92..201.93 rows=1 width=8) (actual time=8.091..8.091 rows=1 loops=1)
   Buffers: shared hit=4465
   ->  Nested Loop  (cost=1.56..201.91 rows=1 width=4) (actual time=8.088..8.089 rows=0 loops=1)
         Buffers: shared hit=4465
         ->  Nested Loop  (cost=1.00..200.70 rows=2 width=4) (actual time=0.041..6.229 rows=4460 loops=1)
               Buffers: shared hit=4465
               ->  Index Scan using index_labels_on_title on public.labels  (cost=0.43..99.51 rows=1 width=4) (actual time=0.029..0.159 rows=32 loops=1)
                     Index Cond: ((labels.title)::text = 'incident'::text)
                     Filter: (((labels.color)::text = '#CC0033'::text) AND ((labels.description)::text = 'Denotes a disruption to IT services and the associated issues require immediate attention'::text))
                     Rows Removed by Filter: 67
                     Buffers: shared hit=102
               ->  Index Scan using index_label_links_on_label_id on public.label_links  (cost=0.56..100.28 rows=91 width=8) (actual time=0.006..0.168 rows=139 loops=32)
                     Index Cond: (label_links.label_id = labels.id)
                     Filter: ((label_links.target_type)::text = 'Issue'::text)
                     Rows Removed by Filter: 0
                     Buffers: shared hit=4363
         ->  Index Only Scan using issues_pkey on public.issues  (cost=0.56..0.60 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=4460)
               Index Cond: ((issues.id = label_links.target_id) AND (issues.id >= 0) AND (issues.id <= 99999))
               Heap Fetches: 0
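
For readers who want to reproduce the shape of these three queries locally, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a stripped-down schema with only the columns the queries touch. The helper names `build_demo_db` and `batch_count`, the sample rows, and the batch size of 100 are illustrative assumptions, not part of this MR. The real counter is generated by the Rails application and, judging by the third query, may start its ranges at 0 rather than at the minimum id.

```python
import sqlite3

# Label attributes copied from the WHERE clause of the queries above.
INCIDENT_LABEL = {
    "title": "incident",
    "color": "#CC0033",
    "description": "Denotes a disruption to IT services and the "
                   "associated issues require immediate attention",
}

# Join shared by the MIN, MAX and COUNT queries.
JOIN_SQL = """
FROM issues
INNER JOIN label_links
  ON label_links.target_id = issues.id AND label_links.target_type = 'Issue'
INNER JOIN labels ON labels.id = label_links.label_id
WHERE labels.title = :title
  AND labels.color = :color
  AND labels.description = :description
"""


def build_demo_db():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE issues (id INTEGER PRIMARY KEY);
        CREATE TABLE labels (id INTEGER PRIMARY KEY, title TEXT,
                             color TEXT, description TEXT);
        CREATE TABLE label_links (label_id INTEGER, target_id INTEGER,
                                  target_type TEXT);
    """)
    conn.executemany("INSERT INTO issues (id) VALUES (?)",
                     [(i,) for i in (1, 2, 3, 250, 251)])
    conn.execute("INSERT INTO labels VALUES (1, :title, :color, :description)",
                 INCIDENT_LABEL)
    # Same title but different color and description: must not be counted.
    conn.execute("INSERT INTO labels VALUES (2, 'incident', '#000000', 'custom')")
    conn.executemany("INSERT INTO label_links VALUES (?, ?, ?)", [
        (1, 1, "Issue"),
        (1, 250, "Issue"),
        (2, 2, "Issue"),          # wrong label, excluded
        (1, 3, "MergeRequest"),   # wrong target type, excluded
    ])
    return conn


def batch_count(conn, batch_size=100_000):
    """Count incident-labeled issues by summing COUNTs over id ranges."""
    lo, hi = conn.execute(
        "SELECT MIN(issues.id), MAX(issues.id) " + JOIN_SQL, INCIDENT_LABEL
    ).fetchone()
    if lo is None:  # no matching rows at all
        return 0
    total = 0
    start = lo
    while start <= hi:
        params = dict(INCIDENT_LABEL, start=start, finish=start + batch_size - 1)
        (count,) = conn.execute(
            "SELECT COUNT(issues.id) " + JOIN_SQL
            + " AND issues.id BETWEEN :start AND :finish",
            params,
        ).fetchone()
        total += count
        start += batch_size
    return total


if __name__ == "__main__":
    print(batch_count(build_demo_db(), batch_size=100))
```

Splitting the count into id ranges keeps each individual query short, which is the point of issuing the MIN and MAX queries first. The plans above show the cost of each step on production-sized data.
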

The third metric mentioned in the issue will be added in a separate MR.

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team

Part of #213894 (closed)

Edited by Rémy Coutable