Add a secondary Snowplow collector URI option and send events automatically to https://snowplow.trx.gitlab.net

Description

Snowplow was integrated into GitLab for tracking pageviews and events. As I pointed out in https://gitlab.com/gitlab-org/gitlab-ee/issues/6336#note_92497355, this could provide a lot of product usage information about GitLab EE users, so it would be awesome to collect this information on GitLab's side as well. To me, it is similar to software offering an "Allow sending anonymous usage data" option.

Proposal

Add a second snowplow_collector_uri to the configuration options, pointing to GitLab's Snowplow collector by default.

Based on https://gitlab.com/gitlab-org/gitlab-ee/issues/6336, this collector endpoint should be https://snowplow.trx.gitlab.net

According to the Snowplow docs, by default any Snowplow method you call is executed by every tracker you have created so far. So if we add the secondary collector, all events should be sent to it automatically. There is also an option to send events to a specific collector only, which could be useful if privacy issues come up at some point.
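
To make the dispatch behaviour concrete, here is a minimal sketch in TypeScript. It is a simplified model of the rule described above, not the Snowplow tracker itself: the TrackerRegistry class, the tracker names "primary" and "gitlab", and the host snowplow.example.com are all hypothetical, and the "method:name1;name2" form for addressing specific trackers is my reading of the Snowplow docs, so it should be checked against the tracker version in use.

```typescript
// Simplified model of multi-tracker dispatch; NOT the real Snowplow API.
type TrackerEvent = { method: string; args: unknown[] };

class TrackerRegistry {
  // tracker name -> collector endpoint
  private collectors = new Map<string, string>();
  // collector endpoint -> events that would have been sent there
  readonly sent = new Map<string, TrackerEvent[]>();

  newTracker(name: string, collectorUri: string): void {
    this.collectors.set(name, collectorUri);
    if (!this.sent.has(collectorUri)) {
      this.sent.set(collectorUri, []);
    }
  }

  // "trackPageView"          -> dispatched to every registered tracker
  // "trackPageView:primary"  -> dispatched only to the named tracker(s)
  call(command: string, ...args: unknown[]): void {
    const sep = command.indexOf(":");
    const method = sep === -1 ? command : command.slice(0, sep);
    const names =
      sep === -1
        ? Array.from(this.collectors.keys())
        : command.slice(sep + 1).split(";");

    for (const name of names) {
      const uri = this.collectors.get(name);
      if (uri === undefined) continue; // unknown tracker name: skip it
      this.sent.get(uri)!.push({ method, args });
    }
  }
}

const registry = new TrackerRegistry();
registry.newTracker("primary", "snowplow.example.com"); // hypothetical instance-owned collector
registry.newTracker("gitlab", "snowplow.trx.gitlab.net"); // proposed secondary collector

registry.call("trackPageView"); // reaches both collectors
registry.call("trackStructEvent:primary", "category", "action"); // reaches only "primary"
```

In this model the secondary collector receives every untargeted event without further changes, while an event that must stay private can be addressed to the primary tracker only.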

Pitfalls

  • This could generate a lot of network traffic for users if much more tracking is added to the product in the future.
  • Privacy issues

Links / references

  • Meta issue for setting up Snowplow on GitLab.com: https://gitlab.com/gitlab-org/gitlab-ee/issues/6329
Edited Sep 03, 2018 by Tamas Szuromi