Add a new parallel Snowplow destination to support database events

Background

The Data team plans to set up and use its own Snowplow event collection pipeline to track every interaction with the gitlab.com database. This means the GitLab application needs to report to two Snowplow collection endpoints.

Goals

  1. Modify the existing event tracking library so that it can report to two collectors. Gitlab::Tracking#event should be able to select the Snowplow destination at runtime, with the default behaviour being to report to the Product Intelligence tracking pipeline. Alternatively, a separate method could be added to Gitlab::Tracking that reports to the new endpoint.
  2. Replace the Gitlab::Tracking#track call at https://gitlab.com/gitlab-org/gitlab/-/blob/6aa3b620a8214f733f3d0acd9bd86384b00d9f84/app/models/concerns/database_event_tracking.rb#L33 with the new method from point 1.
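
One possible shape for goal 1 is a `destination:` keyword argument that defaults to the existing pipeline. The following is only a minimal, self-contained sketch: `TrackingSketch`, `FakeDestination`, the `destination:` keyword, and the `:product_intelligence` / `:database_events` keys are all hypothetical names used for illustration and are not part of Gitlab::Tracking.

```ruby
# Hypothetical sketch of runtime destination selection.
# None of these names exist in the GitLab codebase; they are assumptions.
module TrackingSketch
  # Stand-in for a Snowplow destination; it only records the events it receives.
  class FakeDestination
    attr_reader :events

    def initialize
      @events = []
    end

    def event(category, action, **options)
      @events << [category, action, options]
    end
  end

  # One destination object per collector endpoint.
  DESTINATIONS = {
    product_intelligence: FakeDestination.new,
    database_events: FakeDestination.new
  }.freeze

  DEFAULT_DESTINATION = :product_intelligence

  # Reports to the default pipeline unless the caller picks another one.
  # An unknown destination key raises KeyError instead of silently dropping the event.
  def self.event(category, action, label: nil, property: nil, value: nil, destination: DEFAULT_DESTINATION)
    tracker = DESTINATIONS.fetch(destination)
    tracker.event(category, action.to_s, label: label, property: property, value: value)
  end
end

# Existing callers stay unchanged and keep reporting to the default pipeline.
TrackingSketch.event('projects', :create)

# The database event concern would opt in to the new collector explicitly.
TrackingSketch.event('database_event', :db_update, destination: :database_events)
```

With this shape, the call site from goal 2 would be the only place that passes a non-default destination; a separate method that delegates with a fixed destination would work equally well.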

Implementation tips

The following diff shows the PoC changes that were used to check whether Snowplow can report to two endpoints without issues. Parts of this code might be reused for this issue.

diff --git a/lib/gitlab/tracking.rb b/lib/gitlab/tracking.rb
index 45f836f10d3a..ed65595bfca2 100644
--- a/lib/gitlab/tracking.rb
+++ b/lib/gitlab/tracking.rb
@@ -12,7 +12,7 @@ def event(category, action, label: nil, property: nil, value: nil, context: [],

         action = action.to_s

-        tracker.event(category, action, label: label, property: property, value: value, context: contexts)
+        trackers.each { |t| t.event(category, action, label: label, property: property, value: value, context: contexts) }
       rescue StandardError => error
         Gitlab::ErrorTracking.track_and_raise_for_dev_exception(error, snowplow_category: category, snowplow_action: action)
       end
@@ -55,6 +55,10 @@ def tracker
                        Gitlab::Tracking::Destinations::Snowplow.new
                      end
       end
+
+      def trackers
+        @trackers ||= [Gitlab::Tracking::Destinations::SnowplowMicro.new, Gitlab::Tracking::Destinations::Snowplow.new]
+      end
     end
   end
 end
diff --git a/lib/gitlab/tracking/destinations/snowplow.rb b/lib/gitlab/tracking/destinations/snowplow.rb
index fd877bc01378..a75b98e914e9 100644
--- a/lib/gitlab/tracking/destinations/snowplow.rb
+++ b/lib/gitlab/tracking/destinations/snowplow.rb
@@ -40,10 +40,12 @@ def options(group)

         def enabled?
           Gitlab::CurrentSettings.snowplow_enabled?
+          true
         end

         def hostname
           Gitlab::CurrentSettings.snowplow_collector_hostname
+          "webhook.site/5c4e5edc-d948-4a08-81ec-1e66fe6a7621"
         end

         private
@@ -60,11 +62,15 @@ def cookie_domain
           Gitlab::CurrentSettings.snowplow_cookie_domain
         end

+        def snowplow_namespace
+          SNOWPLOW_NAMESPACE
+        end
+
         def tracker
           @tracker ||= SnowplowTracker::Tracker.new(
             emitters: [emitter],
             subject: SnowplowTracker::Subject.new,
-            namespace: SNOWPLOW_NAMESPACE,
+            namespace: snowplow_namespace,
             app_id: app_id
           )
         end
diff --git a/lib/gitlab/tracking/destinations/snowplow_micro.rb b/lib/gitlab/tracking/destinations/snowplow_micro.rb
index 09480f261064..049ab16685bf 100644
--- a/lib/gitlab/tracking/destinations/snowplow_micro.rb
+++ b/lib/gitlab/tracking/destinations/snowplow_micro.rb
@@ -6,8 +6,9 @@ module Destinations
       class SnowplowMicro < Snowplow
         include ::Gitlab::Utils::StrongMemoize
         extend ::Gitlab::Utils::Override
+        SNOWPLOW_NAMESPACE = 'gl_mic'

-        DEFAULT_URI = 'http://localhost:9090'
+        DEFAULT_URI = "https://webhook.site/2776a6ae-38a4-46b1-90e4-bed6a6d9a0bb" #'http://localhost:9090'

         override :options
         def options(group)
@@ -25,7 +26,11 @@ def enabled?

         override :hostname
         def hostname
-          "#{uri.host}:#{uri.port}"
+          "webhook.site/2776a6ae-38a4-46b1-90e4-bed6a6d9a0bb"
+        end
+
+        def snowplow_namespace
+          SNOWPLOW_NAMESPACE
         end

         def uri
@@ -53,6 +58,7 @@ def base_uri
           url = Gitlab.config.snowplow_micro.address
           scheme = Gitlab.config.gitlab.https ? 'https' : 'http'
           "#{scheme}://#{url}"
+          DEFAULT_URI
         rescue Settingslogic::MissingSetting
           DEFAULT_URI
         end
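
A note on why the PoC adds a `snowplow_namespace` method to both classes rather than only redefining the constant in SnowplowMicro: Ruby resolves a bare constant lexically, so a method defined in the parent class keeps reading the parent's constant even when it is called on a subclass instance. The sketch below illustrates this with hypothetical stand-in classes (`Base`, `Micro`) and illustrative constant values; it does not use the real destination classes.

```ruby
# Hypothetical stand-ins for Snowplow and SnowplowMicro; values are illustrative.
class Base
  NAMESPACE = 'gl'

  # Bare constant reference: resolved in the lexical scope of Base,
  # so this always returns Base::NAMESPACE.
  def direct_namespace
    NAMESPACE
  end

  # Hook method: subclasses can override it to supply their own value.
  def namespace
    NAMESPACE
  end
end

class Micro < Base
  NAMESPACE = 'gl_mic'

  # Defined inside Micro, so NAMESPACE resolves to Micro::NAMESPACE here.
  def namespace
    NAMESPACE
  end
end

micro = Micro.new
puts micro.direct_namespace # parent's constant, despite the subclass redefinition
puts micro.namespace        # subclass constant, via the overridden method
```

Routing the tracker through the overridable method is what lets each destination send its own namespace.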
Edited by Mikołaj Wawrzyniak