Geo: Add secondary git operations monthly unique users metrics
What does this MR do and why?
This MR adds a new internal event, geo_secondary_git_op_action, behind the feature flag track_geo_secondary_git_op_action.
This event tracks git operations on Geo secondary sites.
These requests are proxied to the primary so that the Geo sites are available behind a single location-aware URL. The primary detects git operation requests that originated on a secondary and tracks the internal event with the user, project, and namespace IDs.
None of the requests originating on a primary are tracked.
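The tracking rule described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the code in this MR: GitRequest, EventTracker, the from_secondary attribute, and the flag_enabled argument are hypothetical stand-ins for the real request object, the internal events API, and the feature flag check.

```ruby
# Hypothetical stand-in for a git operation request arriving at the primary.
# `from_secondary` is an assumed attribute meaning "proxied from a Geo secondary".
GitRequest = Struct.new(:user_id, :project_id, :namespace_id, :from_secondary,
                        keyword_init: true)

# Hypothetical in-memory tracker; it only mimics the decision, not the real API.
class EventTracker
  EVENT_NAME = 'geo_secondary_git_op_action'

  attr_reader :events

  def initialize(flag_enabled:)
    @flag_enabled = flag_enabled # stands in for track_geo_secondary_git_op_action
    @events = []
  end

  # Records the event only when the flag is on and the request was proxied
  # from a secondary. Returns true if an event was recorded.
  def track_git_op(request)
    return false unless @flag_enabled
    return false unless request.from_secondary

    @events << {
      name: EVENT_NAME,
      user_id: request.user_id,
      project_id: request.project_id,
      namespace_id: request.namespace_id
    }
    true
  end
end
```

With the flag enabled, a request marked as coming from a secondary records one event carrying the three IDs, while a request that originated on the primary records nothing.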
For requests originating on a secondary:
command | tracks internal event? |
---|---|
git push over http | tracks |
git pull over http | does not track when request doesn’t go to primary |
git push over ssh | tracks |
git pull over ssh | does not track when request doesn’t go to primary |
git push through web | not tested (I can test this as a follow-up to this MR; need to fix my Geo setup) |
Git pulls are not tracked because they are not passed on to the primary unless the repository is outdated on the secondary. This does not lose any data we need, since the metric only counts unique users doing git operations on secondaries. (Any user doing a git push is also doing git pulls.)
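A small sketch may make the argument concrete. It assumes, as the paragraph above does, that every user who pulls also pushes. It counts distinct users exactly with a Set, whereas the real metric is a RedisHLLMetric that estimates the count. The distinct_users helper and the sample events are hypothetical and for illustration only.

```ruby
require 'set'

# Counts distinct user ids among events inside the `days`-day window ending at `now`.
# Exact counting with a Set; a HyperLogLog would give an estimate of the same number.
def distinct_users(events, now:, days: 28)
  cutoff = now - (days * 24 * 60 * 60)
  events
    .select { |event| event[:at] > cutoff && event[:at] <= now }
    .map { |event| event[:user_id] }
    .to_set
    .size
end

# Illustrative data: user 1 pushes twice and also pulls once.
now = Time.utc(2023, 11, 30, 12, 0, 0)
pushes = [
  { user_id: 1, at: Time.utc(2023, 11, 30, 11, 8, 18) },
  { user_id: 1, at: Time.utc(2023, 11, 30, 11, 8, 19) }
]
pulls = [{ user_id: 1, at: Time.utc(2023, 11, 29, 9, 0, 0) }]

# Tracking the pull as well would not change the unique-user count.
puts distinct_users(pushes, now: now) == distinct_users(pushes + pulls, now: now)
```

The count would differ only for a user who pulls from a secondary and never pushes through one, which is the case the MR treats as negligible.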
Related issue: #388119 (closed)
Screenshots or screen recordings
This is how the Snowplow event should show up when using snowplow-micro on GDK or GET:
+-------------------------------------------------------------+-----------------------------+-----------------------+---------------+---------------+
| count_distinct_user_id_from_geo_secondary_git_op_action_28d | geo_secondary_git_op_action | RedisHLLMetric | 1 | 1 |
+-------------------------------------------------------------+-----------------------------+-----------------------+---------------+---------------+
+--------------------------------------------------------------------------------------------------------+
| SNOWPLOW EVENTS |
+-----------------------------+--------------------------+---------+--------------+------------+---------+
| Event Name | Collector Timestamp | user_id | namespace_id | project_id | plan |
+-----------------------------+--------------------------+---------+--------------+------------+---------+
| geo_secondary_git_op_action | 2023-11-30T11:08:18.510Z | 1 | 24 | 2 | default |
| geo_secondary_git_op_action | 2023-11-30T11:08:18.233Z | 1 | 24 | 2 | default |
+-----------------------------+--------------------------+---------+--------------+------------+---------+
How to set up and validate locally
- Set up a Geo primary and secondary with GET (not on GDK, because you need different SSH ports to test git operations over SSH).
- Make sure Snowplow is set up for the primary. Run a monitor for the new metric on the primary with:
rails runner scripts/internal_events/monitor.rb "geo_secondary_git_op_action"
Here you will see new events when a git op is tracked.
- git clone a project from the secondary over ssh
- git push to this project. You should see the message:
This request to a Geo secondary node will be forwarded to the
remote: Geo primary node:
...
- Check if you see a new event (in red) on the monitor (script)
- Change the remote origin to the http address
- git push a new commit
- Check if you see a new event on the monitor (script)
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.