Track Historical Information for the gitlab_subscriptions table
Why is this Important?
Currently our GitLab.com gitlab_subscriptions
table only captures the current state of our data. This is blocking us from doing any meaningful analysis on our subscriptions because we can only see who's subscribed right now. We have no data on who was subscribed two weeks ago or even two hours ago, all we know about is the state in this current moment.
The reasoning is because of the backend design of our application. In our database all records are overwritten to reflect the current state, which is a good design from an application point-of-view, but makes analysis of our data impossible. This "throwing out" of data is causing GitLab Inc. to miss out on capturing tons of valuable information.
For example: if a customers starts a trial on gitlab.com, we'll create a gitlab_subscriptions
record with trial=true
. If they later upgrade to a Gold plan, the existing record gets overwritten with a new record that has trial=false
. By throwing out all information about the trial ever existing, we'll never be able to measure the value or success of trials at GitLab. The exact same thing can be said about free plans, we currently have no way to see any of those conversions.
By preserving historical information, we will be able to calculate the value of trials, free plans, and the LTV of all visitors to GitLab.com. Paul Machle (CFO) has asked us multiple times for an LTV analysis on free users to GitLab.com and this issue is the single thing blocking us from doing that.
Proposal
We plan to track changes to the gitlab_subscriptions
using Snowplow events.
The gitlab_subscripition records the current state of our gitlab_subscriptions. The records are slowly mutable. On the contrary, Snowplow will receive and store sequences of immutable events. Upon any creation, update or deletion event (does that happen ?) in our database a new row will be added with all metadata attached to the specific object. This table will allow us to keep track of all changes happening in the database. For example status changes of a specific subscription (from active to expire, or from trial started to active subscriptions...)
Requirement
Ideally, this tacking event will have all columns that are in gitlab_subscriptions` + extra metadata:
- change_type: creation, update, deletion
Preserve Historical Information for thegitlab_subscriptions
Table
Design
Resorted back to a database based solution for the gitlab_subscriptions
table as it will be quicker to unblock blocked data issues without telemetry concerns.
Performance
-
We must make sure backend tracking events are not lost, and have a certain degree of reliability. -
Only Gitlab.com