Add support for querying unstructured event data in product analytics
Problem to solve
We allow users to send their own custom events via the SDKs. However, we don't provide an easy way for users to visualize these custom events on their dashboards.
Customer example(s):
- I can instrument when `gdk update` is started and finishes but can't get the data out of product analytics.
- I can instrument how many users save a Custom Dashboard but can't get the data out of the cluster.
Technical details
Custom events are stored in the DB with the `event_name` of `custom_event`. Each custom event is then split into its name and properties under the columns `custom_event_name` and `custom_event_props` respectively. Custom event properties can either be `null`, `undefined`, or an object/hash.
So when visualizing custom events we will need to cater for events that are only labels, as well as events with detailed differentiations within their properties. Both will need to be used for visualization purposes.
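As a rough illustration of the two cases above, the following sketch counts label-only events by name and, separately, breaks one event down by a property. The column names come from this issue; the sample rows, the `arch` property, and the helper functions are hypothetical, and a missing key stands in for `undefined`.

```python
from collections import Counter
from typing import Any

# Hypothetical rows shaped after the columns described above.
ROWS: list[dict[str, Any]] = [
    # Label-only event: properties are null.
    {"event_name": "custom_event", "custom_event_name": "dashboard_saved",
     "custom_event_props": None},
    # Label-only event: properties are "undefined" (key absent).
    {"event_name": "custom_event", "custom_event_name": "dashboard_saved"},
    # Event with detailed properties.
    {"event_name": "custom_event", "custom_event_name": "gdk_update",
     "custom_event_props": {"arch": "arm64", "success": True}},
]


def props_of(row: dict[str, Any]) -> dict[str, Any]:
    """Return the event properties as a dict; null and missing both become {}."""
    props = row.get("custom_event_props")
    return props if isinstance(props, dict) else {}


def count_by_name(rows: list[dict[str, Any]]) -> Counter:
    """Count custom events by name, regardless of whether they carry properties."""
    return Counter(
        r["custom_event_name"] for r in rows if r.get("event_name") == "custom_event"
    )


def count_by_property(rows: list[dict[str, Any]], name: str, prop: str) -> Counter:
    """Count events with the given name, grouped by the value of one property."""
    return Counter(
        props_of(r)[prop]
        for r in rows
        if r.get("custom_event_name") == name and prop in props_of(r)
    )
```

Events without properties still contribute to `count_by_name`, while `count_by_property` silently skips them, which is one possible way to let both kinds feed a visualization.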
It is unlikely that the visualization designer will be able to query a user's content for every custom event name and then every unique property detail. Depending on the amount of data, this would be computationally expensive, and it would need to be pretty smart to cover the variety of possible use cases.
A simpler MVC for this feature would be to create an API and/or UI for users to create their own custom metrics and/or dimensions. The user would set which property they want to use and any other potential properties we should check as well. This would then be added to the visualization designer for the user to make use of.
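A user-defined dimension of this kind might look roughly like the sketch below. Nothing here is an existing API: `CustomDimension`, its fields, and the reading of "potential properties we should check as well" as an ordered list of fallback property names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class CustomDimension:
    """Hypothetical user-defined dimension over one custom event."""

    label: str                      # Name shown in the visualization designer.
    custom_event_name: str          # Which custom event the dimension applies to.
    primary_property: str           # Property the user wants to use.
    fallback_properties: tuple[str, ...] = ()  # Other properties to check, in order.

    def value_for(self, row: dict[str, Any]) -> Optional[Any]:
        """Return the dimension value for a row, or None if it does not apply."""
        if row.get("custom_event_name") != self.custom_event_name:
            return None
        props = row.get("custom_event_props")
        if not isinstance(props, dict):
            # Properties are null or undefined: the event is label-only.
            return None
        for key in (self.primary_property, *self.fallback_properties):
            if key in props:
                return props[key]
        return None


# Example definition: prefer "arch", fall back to "architecture".
ARCH = CustomDimension("Architecture", "gdk_update", "arch", ("architecture",))
```

Because the user names the event and properties up front, the designer would only need to read the listed keys rather than scan every event name and property in the user's data.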
Customer case
We are in the process of instrumenting GDK to provide usage metrics/analytics: Track `gdk install` and `gdk update` usage (gitlab-development-kit#1746 - closed)
This will be fairly basic, probably tracking install, update, and a few other commands.
We will ideally collect the command, time taken, success, system spec (maybe architecture, OS, RAM etc.) and potentially a user identifier (I still need to understand the PII implications).
My initial thought was to send this data to GitLab as unstructured event data.
However, upon testing, it looks like we can only get a count of unstructured events; we cannot query any of the unstructured event data to separate updates from installs, group by architecture, etc.
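To make the gap concrete, here is a small in-memory sketch of the difference between a plain count and the grouped breakdown we would like. The event name `gdk_command` and the property names (`command`, `arch`, `duration_s`, `success`) are made up for this example and are not what the GDK instrumentation actually sends.

```python
from collections import defaultdict
from statistics import mean
from typing import Any

# Hypothetical GDK events; property names are illustrative only.
EVENTS: list[dict[str, Any]] = [
    {"custom_event_name": "gdk_command",
     "custom_event_props": {"command": "update", "arch": "arm64", "duration_s": 120.0, "success": True}},
    {"custom_event_name": "gdk_command",
     "custom_event_props": {"command": "update", "arch": "x86_64", "duration_s": 180.0, "success": False}},
    {"custom_event_name": "gdk_command",
     "custom_event_props": {"command": "install", "arch": "arm64", "duration_s": 600.0, "success": True}},
    {"custom_event_name": "gdk_command",
     "custom_event_props": {"command": "update", "arch": "arm64", "duration_s": 100.0, "success": True}},
]


def group_events(events: list[dict[str, Any]], keys: tuple[str, ...]) -> dict[tuple, list[dict]]:
    """Group event properties by the values of the given property keys."""
    groups: dict[tuple, list[dict]] = defaultdict(list)
    for event in events:
        props = event.get("custom_event_props") or {}
        groups[tuple(props.get(k) for k in keys)].append(props)
    return dict(groups)


def summarize(events: list[dict[str, Any]], keys: tuple[str, ...]) -> dict[tuple, dict[str, Any]]:
    """Per group: event count and mean duration (assumes every event has duration_s)."""
    return {
        group: {"count": len(items), "mean_duration_s": mean(p["duration_s"] for p in items)}
        for group, items in group_events(events, keys).items()
    }


# What is available today: a single total.
total = len(EVENTS)

# What we would like: a breakdown by command and architecture.
breakdown = summarize(EVENTS, ("command", "arch"))
```

With this sample data, `total` is 4, while `breakdown` separates updates from installs and splits each by architecture.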
For now I will try and shoehorn the data into page views, but it would be great if we could add better support for unstructured events in the future.