[Discussion] Add support for SM/Dedicated for Managed Product Analytics provider
Problem Statement
- Product Analytics managed stack is complete and soon to be available to SaaS customers. It is currently being dogfooded by internal customers.
- Self-managed and Dedicated customers must currently run the product analytics stack themselves, using the provided k8s helm charts.
- Customers who do not wish to do this cannot use product analytics.
Solution
The Cloud Connector group is creating a solution to this problem by exposing a set of APIs that allow non-GitLab.com installations to access shared managed backends provided by GitLab. It is likely that we should pivot to using this system.
I propose that we first develop a Cloud Connector-enabled solution for non-GitLab.com installations, before migrating the existing GitLab.com managed infrastructure. The two flows are different, so they would need to be implemented separately anyway.
I'm envisioning that the analytics manager would remain the single point of access to the managed analytics stack for onboarding and deprovisioning.
Implementation plan
Fundamentally, I think we'll need a new service that acts similarly to the AI Gateway: the "Monitor Gateway". It should handle all requests from any GitLab instance for any of the Monitor features.
Onboarding a new project (not GitLab.com)
The Monitor Gateway should now be the single point of control for all PA calls, including querying cube. This is very different from the situation we have now, and a new analytics-manager should probably be developed in parallel to the current one.
- Use the self-managed provider if one is provided by either the instance admin or the project owner.
- SM instance regularly keeps access information up to date from customers.gitlab.com. (From the diagrams here, I think this is already handled)
- Project owner chooses "use managed-provider".
- SM instance calls `cloud.gitlab.com`, assuming that user is permitted to use feature. (In this case, Ultimate only - also add-on purchase from CDot required.)
- Monitor Gateway fetches signing keys from customers.dot, if required.
- Monitor Gateway validates JWT received from SM instance.
- Monitor Gateway should then complete an onboarding process for the new instance. This should look similar to the current onboarding process for a new project, but should segment newly created ClickHouse databases by GitLab instance ID. (This is received via the JWT).
  - For example, right now each project in the ClickHouse database is named `gitlab_project_PROJECT_ID`, but should now be `gitlab_project_XXXXX`, where `XXXXX` is a hash of both the instance ID and project ID.
- Once provisioned, the connection details should be returned to the SM instance and stored at the instance level. I'm not yet sure what this payload should include.
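The database naming step above could be sketched roughly as follows. This is only an illustration under assumptions: the function name `managed_database_name`, the choice of SHA-256, the `:` separator, and the truncation to 32 hex characters are all hypothetical and not something the current analytics manager does.

```python
import hashlib


def managed_database_name(instance_id: str, project_id: int) -> str:
    """Derive a per-instance, per-project ClickHouse database name.

    Assumption: instance_id is the instance identifier taken from the JWT,
    and project_id is the numeric project ID on that instance.
    """
    # Join both IDs with a separator so that different (instance, project)
    # pairs cannot produce the same input string.
    payload = f"{instance_id}:{project_id}".encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    # Truncate to keep the name short; 32 hex characters is 128 bits.
    return f"gitlab_project_{digest[:32]}"


if __name__ == "__main__":
    print(managed_database_name("example-instance-id", 42))
```

The name is deterministic, so the same instance and project always map to the same database, while the same project ID on two different instances maps to two different databases.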
This section of work is likely to be extensive. We would be rebuilding the vast majority of the current onboarding flow.
Querying cube
The authentication and querying flow to cube would have to move from a single API key stored at the instance level to the use of JWKS and a security context.
Currently, the GitLab API generates a JWT that grants access to the correct project. The encryption key is stored at the instance-level.
We should validate whether we can use the same key from CDot here. Can we verify the claim? It would include the instance ID as the `sub` field, which would help us deduce the correct CH database to query.
We would also have to combine the claim with the project ID, but that should be possible.
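As a rough sketch of combining the claim with the project ID, the snippet below builds a security context from claims that have already been verified. It does not perform signature verification against JWKS. The names `SecurityContext` and `build_security_context`, and the SHA-256 based database name, are hypothetical and only illustrate the idea.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class SecurityContext:
    instance_id: str
    project_id: int
    database: str


def build_security_context(claims: Mapping[str, Any], project_id: int) -> SecurityContext:
    """Combine verified JWT claims with a project ID.

    Assumption: `claims` comes from a token whose signature was already
    validated, and its `sub` field carries the GitLab instance ID.
    """
    instance_id = claims.get("sub")
    if not isinstance(instance_id, str) or not instance_id:
        raise ValueError("token has no usable 'sub' (instance ID) claim")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(project_id, bool) or not isinstance(project_id, int) or project_id <= 0:
        raise ValueError("project_id must be a positive integer")
    # Hypothetical naming scheme: hash of instance ID and project ID.
    digest = hashlib.sha256(f"{instance_id}:{project_id}".encode("utf-8")).hexdigest()
    return SecurityContext(instance_id, project_id, f"gitlab_project_{digest[:32]}")
```

A query handler could then call `build_security_context(claims, project_id)` and use the resulting `database` field to pick the ClickHouse database, rejecting the request if a `ValueError` is raised.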