Add a definition for all Snowplow events emitted by the GitLab backend

Problem

Some events are being sent to our Snowplow collector without an event definition. These events are essentially undocumented, which makes event discovery harder and makes it difficult to estimate migration efforts and similar work.

The "undocumented" backend events can be found with the query below, which assumes that backend events don't emit a page_url_path:

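-- Backend event actions seen in the last 7 days that have no event definition
SELECT event_action
FROM prod.common_mart.mart_behavior_structured_event
WHERE behavior_at > CURRENT_DATE - 7
-- Skip NULL actions: a single NULL in the subquery would make NOT IN match no rows
AND event_action NOT IN (SELECT action FROM SREHM_PREP.PUBLIC.EVENT_DEFINITIONS WHERE action IS NOT NULL)
AND page_url_path IS NULL -- assumed to identify backend events
AND app_id = 'gitlab'
GROUP BY 1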

NOTE: SREHM_PREP.PUBLIC.EVENT_DEFINITIONS is a manual upload of the CSV export from metrics.gitlab.com/events, so it needs to be kept up-to-date manually; otherwise the above query might include events that are already defined. It should be accessible to anyone with the SNOWFLAKE_ANALYST role.

Alternatively, you can create your own version of the table:

  1. Go to metrics.gitlab.com/events and click the export button.
  2. In Snowflake navigate to Data > Add Data
  3. Select a warehouse and select [your_username]_prep Database
  4. Under File format click View options
  5. Under Header select First line contains header
  6. Then import the data and you can access it in your own table.
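
Because the uploaded table can go stale, results of the query can also be re-checked locally against a fresh export before any work starts. The following is only a minimal sketch: the function name `undefined_actions` is hypothetical, the column name `action` is assumed from the query above, and the sample data is made up.

```python
import csv
import io


def undefined_actions(definitions_csv, observed_actions, column="action"):
    """Return observed actions that have no row in the definitions export, sorted."""
    reader = csv.DictReader(definitions_csv)
    # Assumption: the export has a header row with an "action" column.
    defined = {row[column].strip() for row in reader if row.get(column)}
    return sorted(set(observed_actions) - defined)


# Made-up sample data for illustration only.
sample_export = io.StringIO("action,category\nevent_a,CategoryA\nevent_b,CategoryB\n")
print(undefined_actions(sample_export, ["event_a", "event_c", "event_c"]))
# prints ['event_c']
```

With a real export, the first argument would be the opened CSV file and the second the event_action values returned by the query.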

Desired Outcome

All events emitted by the GitLab backend are documented.

Potential Solution

  1. Build a Snowflake query that can check tier and/or identifiers data based on the event name

Then, for each event:

  1. Manually find the MR that introduced the event [we could try to automate this with a script, but it would still require manual confirmation]
  2. Based on the MR, write a description
  3. Write down the MR metadata into attributes like milestone, product_group [it's possible that this could be automated]
  4. Fill out tier & identifiers based on the Snowflake data and/or the MR diff

For some events, we might need to talk with the team that introduced them to fill out attributes such as the description or an unclear product_group.
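
The first per-event step (finding the MR that introduced an event) could be partly scripted with git's pickaxe search (`git log -S`), which lists commits that changed the number of occurrences of a string. The sketch below is an assumption-laden starting point, not a finished tool: all function names are hypothetical, it assumes a local clone of the repository, and it assumes merge commits carry GitLab's default "See merge request ...!N" line. The result is only a candidate and still needs manual confirmation.

```python
import re
import subprocess

# Assumption: merge commit messages contain "See merge request <path>!<iid>".
MR_REFERENCE = re.compile(r"See merge request (\S*!\d+)")


def pickaxe_command(event_name):
    """Build a git command listing commits that added or removed event_name, oldest first."""
    return ["git", "log", "--reverse", "--format=%H", f"-S{event_name}"]


def merge_request_reference(commit_message):
    """Extract the MR reference from a commit message, or None if absent."""
    match = MR_REFERENCE.search(commit_message)
    return match.group(1) if match else None


def run_git(repo_path, args):
    """Run a git command in repo_path and return its stdout."""
    result = subprocess.run(
        args, cwd=repo_path, capture_output=True, text=True, check=True
    )
    return result.stdout


def candidate_merge_request(repo_path, event_name):
    """Return (introducing commit SHA, MR reference or None) as a candidate only."""
    shas = run_git(repo_path, pickaxe_command(event_name)).split()
    if not shas:
        return None, None
    first = shas[0]
    # Merge commits between the introducing commit and HEAD, oldest listed first.
    merges = run_git(
        repo_path,
        ["git", "log", "--merges", "--ancestry-path", "--reverse",
         "--format=%H", f"{first}..HEAD"],
    ).split()
    if not merges:
        return first, None
    message = run_git(repo_path, ["git", "show", "-s", "--format=%B", merges[0]])
    return first, merge_request_reference(message)
```

Calling `candidate_merge_request("/path/to/clone", "some_event_action")` (both arguments are placeholders) would return the earliest commit touching that string and, if the assumptions hold, a reference to the MR that likely merged it. Events whose names are built dynamically in code will not be found this way.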

How to verify

Further actions needed

Edited by Michał Wielich