Customer Product Usage Data Investigation
**Business Initiatives Supported**
- [Gitlab.com <> Customers, License, Zuora Integration](https://gitlab.com/groups/gitlab-org/-/epics/3602)
- [FY21-Q3 Deploying Telemetry for CRO Org](https://gitlab.com/groups/gitlab-com/-/epics/736)
- [CRO KR: 100% of portal data is visible in SFDC](https://gitlab.com/gitlab-com/sales-team/field-operations/sales-operations/-/issues/889)
## Overview ##
To enable self-service analysis of customer-level product usage data in Sisense, and to establish data pumps that expose that data in Gainsight and SFDC, we need a roadmap of changes that fix the underlying data integration issues between GitLab.com, Customers, License, Zuora, and Version. This roadmap should be informed by a dimensional model design for customer product usage data and by the business questions and analyses that depend on a robust underlying integration.
## Work to be Done ##
- [ ] Develop first iteration Dimensional Model design for Customer product usage data
- [ ] Develop first iteration ERD for the Dimensional Model
- [ ] Investigate and Document integration challenges at the customer level between:
    - Gitlab <> Customers
    - Customers <> License
    - Customers <> Zuora
    - License <> Zuora
    - License <> Version
- [ ] Develop wave 2 Telemetry metrics for Master Subscription Product Usage Data [dashboard](https://app.periscopedata.com/app/gitlab/686439/WIP:-Master-Subscription-Product-Usage-Data-Process)
- [ ] Add documentation to the handbook via [Documentation/Taxonomy around Metrics](https://gitlab.com/gitlab-data/analytics/-/issues/5220)
## Detail ##
**Develop first iteration Dimensional Model design for Customer product usage data**
Identify the appropriate facts and dimensions to develop a single view of our customers and potential customers across GitLab, Customers, and Zuora, scoped to the Lead-to-Cash flow. This includes free customers, trials, and paying customers on both GitLab.com and self-managed.
**Develop first iteration ERD for the Dimensional Model**
Create an ERD that defines the minimum set of join keys needed to create a single view of customers across GitLab, Customers, Zuora, License, and Version. The fields defined in this ERD will serve as the basis for investigating gaps in the current-state integrations.
**Investigate and Document integration challenges at the customer level**
Based on the ideal-state dimensional model ERD, document the availability of data in the current-state integrations and the resulting business impacts. For example, extend the analysis in [Document data integrity issues between GitLab.com <> Customers](https://gitlab.com/gitlab-data/analytics/-/issues/5254) to the other relevant integrations.
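As a rough illustration of what such an investigation might measure, the sketch below counts how many records in one system can be matched to another on a shared join key. The record layouts and the `zuora_subscription_id` field name are hypothetical placeholders for whatever keys the ERD ends up defining. They are not the actual schema of Customers or Zuora.

```python
from typing import Iterable, Mapping, Optional


def match_rate(
    left: Iterable[Mapping[str, Optional[str]]],
    right: Iterable[Mapping[str, Optional[str]]],
    key: str,
) -> dict:
    """Summarize how many left-side records can be joined to the right side on `key`."""
    # Collect the non-empty key values that exist on the right side.
    right_keys = {row[key] for row in right if row.get(key)}

    total = missing_key = matched = 0
    for row in left:
        total += 1
        value = row.get(key)
        if not value:
            missing_key += 1  # the join key was never populated
        elif value in right_keys:
            matched += 1      # the key resolves to a right-side record

    return {
        "total": total,
        "matched": matched,
        "missing_key": missing_key,
        "unmatched": total - missing_key - matched,  # key present but no counterpart
        "match_rate": matched / total if total else 0.0,
    }


# Hypothetical sample records, for illustration only.
customers_orders = [
    {"order_id": "o1", "zuora_subscription_id": "S-1"},
    {"order_id": "o2", "zuora_subscription_id": None},
    {"order_id": "o3", "zuora_subscription_id": "S-9"},
]
zuora_subscriptions = [
    {"zuora_subscription_id": "S-1"},
    {"zuora_subscription_id": "S-2"},
]

summary = match_rate(customers_orders, zuora_subscriptions, "zuora_subscription_id")
print(summary)
```

Separating `missing_key` from `unmatched` distinguishes records where the key was never captured from records where the key exists but points at nothing, which may call for different fixes.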
**Develop wave 2 Telemetry metrics for Master Subscription Product Usage Data dashboard**
Develop the defined wave 2 [Telemetry metrics](https://docs.google.com/spreadsheets/d/1ZR7duYmjQ8x86iAJ1dCix88GTtPlOyNwiMgeG_85NiA/edit#gid=0) and append them to the existing wave 1 metrics in the Master Subscription Product Usage Data set. As with wave 1, constrain scope to the subscriptions and customers that can be stitched accurately between systems using the current-state integrations. Where data availability reveals business impacts, document them as part of this effort.
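The scoping rule can be sketched as follows: keep only subscriptions for which every link in the chain resolves. This is a minimal sketch under assumed linkages (subscription to license to usage ping). The field names `subscription_id` and `license_id` are hypothetical, and the real join paths depend on the ERD.

```python
def stitchable_subscription_ids(subscriptions, licenses, usage_pings):
    """Return ids of subscriptions whose subscription -> license -> usage ping chain resolves."""
    # Map each subscription id to the licenses that reference it.
    licenses_by_subscription = {}
    for lic in licenses:
        sub_id = lic.get("subscription_id")
        if sub_id:
            licenses_by_subscription.setdefault(sub_id, set()).add(lic["license_id"])

    # Licenses that have reported at least one usage ping.
    reporting_licenses = {p["license_id"] for p in usage_pings if p.get("license_id")}

    # A subscription is in scope only if one of its licenses has reported usage.
    return sorted(
        s["subscription_id"]
        for s in subscriptions
        if licenses_by_subscription.get(s["subscription_id"], set()) & reporting_licenses
    )


# Hypothetical sample records, for illustration only.
subscriptions = [{"subscription_id": "S-1"}, {"subscription_id": "S-2"}, {"subscription_id": "S-3"}]
licenses = [
    {"license_id": "L-1", "subscription_id": "S-1"},
    {"license_id": "L-2", "subscription_id": "S-2"},
    {"license_id": "L-3", "subscription_id": None},  # license not linked to any subscription
]
usage_pings = [{"license_id": "L-1"}, {"license_id": "L-3"}]

stitchable = stitchable_subscription_ids(subscriptions, licenses, usage_pings)
print(stitchable)
```

In this sample only `S-1` stays in scope: `S-2` has a license that never reported usage, and `S-3` has no license at all.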
**Add documentation to the handbook via Documentation/Taxonomy around Metrics**
Based on the preceding investigations, incorporate documentation into the handbook on how to accurately define, query, and interpret customer product usage data based on current-state integrations. Merge ideal-state Dimensional Model documentation into the handbook.