Geo: Add secondary git operations monthly unique users metrics

Aakriti Gupta requested to merge ag-add-sec-git-ops-event into master

What does this MR do and why?

This MR adds a new internal event, geo_secondary_git_op_action, behind the feature flag track_geo_secondary_git_op_action. The event tracks git operations on Geo secondary sites.

These requests are proxied to the primary so that the Geo sites are available behind a single location-aware URL. The primary detects git operation requests that originated on a secondary and tracks the internal event with the user, project, and namespace IDs.

None of the requests originating on a primary are tracked.

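For requests originating on a secondary:

+----------------------+----------------------------------------------------------------------------------+
| Command              | Tracks internal event?                                                           |
+----------------------+----------------------------------------------------------------------------------+
| git push over HTTP   | Tracks                                                                           |
| git pull over HTTP   | Does not track when the request doesn't go to the primary                        |
| git push over SSH    | Tracks                                                                           |
| git pull over SSH    | Does not track when the request doesn't go to the primary                        |
| git push through web | Not tested (I can test this as a follow-up to this MR; need to fix my Geo setup) |
+----------------------+----------------------------------------------------------------------------------+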

Git pulls are not tracked because they are not passed on to the primary unless the repository is outdated on the secondary. This does not result in a loss of data, since we only need the unique users doing git operations on secondaries, and any user doing a git push is also doing git pulls.
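The tracking rules above can be summarized in a small, self-contained Ruby sketch. This is not the code in this MR: GitRequest, its fields, and geo_secondary_git_op_event are hypothetical names used only for illustration, and the feature flag is modeled as a plain boolean.

```ruby
# Hypothetical, simplified model of the tracking decision (not GitLab's implementation).
GitRequest = Struct.new(:origin, :forwarded_to_primary, :user_id, :project_id, :namespace_id,
                        keyword_init: true)

EVENT_NAME = 'geo_secondary_git_op_action'

# Returns the event payload to record, or nil when nothing should be tracked.
def geo_secondary_git_op_event(request, feature_enabled:)
  return nil unless feature_enabled               # track_geo_secondary_git_op_action is off
  return nil unless request.origin == :secondary  # requests originating on a primary are never tracked
  return nil unless request.forwarded_to_primary  # e.g. a pull served by the secondary itself

  {
    event: EVENT_NAME,
    user_id: request.user_id,
    project_id: request.project_id,
    namespace_id: request.namespace_id
  }
end

push = GitRequest.new(origin: :secondary, forwarded_to_primary: true,
                      user_id: 1, project_id: 2, namespace_id: 24)
pull = GitRequest.new(origin: :secondary, forwarded_to_primary: false,
                      user_id: 1, project_id: 2, namespace_id: 24)

p geo_secondary_git_op_event(push, feature_enabled: true) # a payload hash
p geo_secondary_git_op_event(pull, feature_enabled: true) # nil
```

The point of the sketch is that only requests that actually reach the primary can be seen there, which is why pulls served locally by a secondary produce no event.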

Related issue: #388119 (closed)

Screenshots or screen recordings

This is how the Snowplow event should show up using snowplow-micro on GDK or GET:

+-------------------------------------------------------------+-----------------------------+-----------------------+---------------+---------------+
| count_distinct_user_id_from_geo_secondary_git_op_action_28d | geo_secondary_git_op_action | RedisHLLMetric        | 1             | 1             |
+-------------------------------------------------------------+-----------------------------+-----------------------+---------------+---------------+
+--------------------------------------------------------------------------------------------------------+
|                                            SNOWPLOW EVENTS                                             |
+-----------------------------+--------------------------+---------+--------------+------------+---------+
| Event Name                  | Collector Timestamp      | user_id | namespace_id | project_id | plan    |
+-----------------------------+--------------------------+---------+--------------+------------+---------+
| geo_secondary_git_op_action | 2023-11-30T11:08:18.510Z | 1       | 24           | 2          | default |
| geo_secondary_git_op_action | 2023-11-30T11:08:18.233Z | 1       | 24           | 2          | default |
+-----------------------------+--------------------------+---------+--------------+------------+---------+
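The output above shows two events from the same user and a metric value of 1. A minimal Ruby sketch of that relationship follows, assuming the _28d suffix means a trailing 28-day window. The real metric is a RedisHLLMetric, which estimates the count with HyperLogLog; the exact uniq-based count and the distinct_user_count helper here are illustrative stand-ins, not GitLab code.

```ruby
require 'time'

# Events copied from the monitor output above (only the fields needed here).
events = [
  { name: 'geo_secondary_git_op_action', collected_at: Time.iso8601('2023-11-30T11:08:18.510Z'), user_id: 1 },
  { name: 'geo_secondary_git_op_action', collected_at: Time.iso8601('2023-11-30T11:08:18.233Z'), user_id: 1 }
]

# Hypothetical helper: exact count of distinct users for an event in a trailing window.
# (The 28-day window is an assumption based on the metric name.)
def distinct_user_count(events, name:, now:, window_days: 28)
  cutoff = now - (window_days * 24 * 60 * 60)
  events
    .select { |e| e[:name] == name && e[:collected_at] >= cutoff && e[:collected_at] <= now }
    .map { |e| e[:user_id] }
    .uniq
    .size
end

now = Time.iso8601('2023-11-30T12:00:00Z')
puts distinct_user_count(events, name: 'geo_secondary_git_op_action', now: now) # prints 1
```

Two pushes by one user count once, which matches the single unique user reported for the two events in the table.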

How to set up and validate locally

  1. Set up a Geo primary and secondary. Use GET rather than GDK, because testing git operations over SSH requires different SSH ports.
  2. Make sure Snowplow is set up for the primary. Run a monitor for the new metric on the primary with:

     rails runner scripts/internal_events/monitor.rb "geo_secondary_git_op_action"

     New events appear here whenever a git operation is tracked.
  3. git clone a project from the secondary over SSH.
  4. git push to this project. You should see the message:

     This request to a Geo secondary node will be forwarded to the
     remote: Geo primary node:
     ...
  5. Check that a new event (in red) appears on the monitor script.
  6. Change the remote origin to the HTTP address.
  7. git push a new commit.
  8. Check that a new event appears on the monitor script.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Aakriti Gupta