Instrument Monitor GMAU via aggregate counter in usage ping

Overview

Define and instrument the Monitor:Health GMAU, which will be based on Incident Management. The proposal and content of this issue were copied from #229918 (closed), which describes the Plan team's strategy for project management.

GMAU definition for Incident management

The sum of all unique users that took an action on an Alert or Incident over a given period of time.

Goal

Track user interactions with alerts and incidents in the usage ping so that we can calculate Monthly Active Users (MAU) for Incident Management, which will be the GMAU for Monitor:Health. It is important that we have an accurate reflection of MAU, as it is one of the primary metrics that GitLab leadership uses to make informed investment decisions across the product.

Proposal

  1. Implement a single counter in the usage ping that aggregates the total count of unique users that took an action (listed below) on an alert or incident.
  2. (might be reprioritized for later) Instrument a separate usage ping counter for each action so that we can understand the usage of each action on its own.

Record the aggregate unique MAU counts under incident_management_events in the usage ping.
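
A minimal in-memory sketch of the aggregate counter from step 1 may help make the idea concrete. The class name `IncidentManagementCounter`, its methods, and the sample dates are hypothetical; this is not GitLab's actual Usage Ping code, which would need a persistent store. Every tracked action feeds one set of user IDs per calendar month, so a user who performs several actions is counted once.

```ruby
require 'set'
require 'date'

# Hypothetical illustration only: class and method names are assumptions.
class IncidentManagementCounter
  # Action names taken from the event list in this issue.
  EVENTS = %w[
    alert_status_change alert_assigned alert_todo
    incident_created incident_reopened incident_closed
    incident_assigned incident_todo incident_comment
    incident_zoom_meeting incident_published
    incident_relate incident_unrelate incident_change_confidential
  ].freeze

  def initialize
    # Maps "YYYY-MM" => Set of user IDs seen in that month.
    @users_by_month = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Record that user_id performed `event` on `date`.
  def track(event, user_id, date)
    raise ArgumentError, "unknown event: #{event}" unless EVENTS.include?(event)

    @users_by_month[date.strftime('%Y-%m')] << user_id
  end

  # Unique users who performed any tracked action in the given month.
  def monthly_unique_users(year, month)
    key = format('%04d-%02d', year, month)
    @users_by_month.key?(key) ? @users_by_month[key].size : 0
  end
end

counter = IncidentManagementCounter.new
counter.track('alert_assigned', 1, Date.new(2020, 9, 3))
counter.track('incident_comment', 1, Date.new(2020, 9, 10)) # same user, second action
counter.track('incident_closed', 2, Date.new(2020, 9, 12))
puts counter.monthly_unique_users(2020, 9) # prints 2
```

User 1 performs two different actions but contributes only once to the monthly total, which is the behavior the GMAU definition asks for.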

Why are we using a single counter to define GMAU?

For now, we cannot identify unique users across multiple actions if each action is reported in its own Usage Ping metric, because per-action counts cannot be deduplicated against each other. Usage Ping only gives us a broad signal for "are users interacting", so the more bundled our counter, the better.
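
The overcounting problem can be illustrated with a small sketch. The helper names and the sample user IDs are made up for illustration. Summing per-action unique counts counts a user once per action they performed, while a single bundled set counts each user once.

```ruby
require 'set'

# Hypothetical helper: what we would get by adding up separate per-action metrics.
# Each metric reports only a count, so user identity is no longer available.
def sum_of_per_action_counts(users_per_action)
  users_per_action.values.sum(&:size)
end

# Hypothetical helper: what a single bundled counter reports.
# Deduplication happens while the user IDs are still known.
def bundled_unique_count(users_per_action)
  users_per_action.values.reduce(Set.new, :|).size
end

# Made-up sample data: user IDs seen per action in one month.
sample = {
  'alert_assigned'   => Set[1, 2],
  'incident_created' => Set[2, 3],
  'incident_comment' => Set[1, 2, 3]
}

puts sum_of_per_action_counts(sample) # prints 7
puts bundled_unique_count(sample)     # prints 3
```

Only three distinct users were active in this sample, yet the summed per-action metrics would suggest seven.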

Events To Aggregate In Usage Ping

| Action | Description | Added To Counter |
|---|---|---|
| alert_status_change | Changed status of alert | |
| alert_assigned | Changed assignee on alerts | |
| alert_todo | Added to-do on alert | |
| incident_created | Incident created | |
| incident_reopened | Reopened incident | |
| incident_closed | Closed incident | |
| incident_assigned | Changed assignee on incidents | |
| incident_todo | Added to-do on incidents | |
| incident_comment | Comment added to incident | |
| incident_zoom_meeting | Zoom meeting associated with incident | |
| incident_published | Incident published to status page | |
| incident_relate | Marked incident as related to another | |
| incident_unrelate | Removed relation on incident | |
| incident_change_confidential | Marked the incident as confidential or non-confidential | |

Identifiers

Totals count unique users across all events tracked in the same category, in this case incident_management. Total metrics have the format #{category}_total_unique_counts_weekly and #{category}_total_unique_counts_monthly.

For Incident management, the identifiers are as follows:

  • Weekly: incident_management_total_unique_counts_weekly
  • Monthly: incident_management_total_unique_counts_monthly
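
As a small sketch of how the naming format expands, the helper below builds both identifiers from a category name. The method name `total_unique_count_metric_names` is hypothetical; only the format strings come from this issue.

```ruby
# Hypothetical helper; only the name format is taken from the issue text.
def total_unique_count_metric_names(category)
  {
    weekly: "#{category}_total_unique_counts_weekly",
    monthly: "#{category}_total_unique_counts_monthly"
  }
end

names = total_unique_count_metric_names('incident_management')
puts names[:weekly]  # prints incident_management_total_unique_counts_weekly
puts names[:monthly] # prints incident_management_total_unique_counts_monthly
```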
Edited by Sarah Waldner