[Discussion] Add support for SM/Dedicated for Managed Product Analytics provider

Problem Statement

  • The Product Analytics managed stack is complete and will soon be available to SaaS customers. It is currently being dogfooded by internal customers.
  • Self-managed and Dedicated customers must currently run the product analytics stack themselves, using the provided k8s helm charts.
  • Customers who do not wish to do this cannot use product analytics.

Solution

~"group::cloud connector" are creating a solution to this problem by exposing a set of APIs that allow non-GitLab.com installations to access shared managed backends provided by GitLab. It is likely that we should pivot to using this system.

I'm proposing that we initially develop a Cloud Connector-enabled solution for non-GitLab.com installations, before migrating the existing GitLab.com managed infrastructure. The two flows are different, so they would need to be implemented separately anyway.

I'm envisioning that the analytics manager would remain the single point of access to the managed analytics stack for onboarding and deprovisioning.

Implementation plan

Fundamentally, I think we'll need a new service that acts similarly to the AI Gateway: the "Monitor Gateway". It should handle all requests from any GitLab instance for any of the Monitor features.

Onboarding a new project (not GitLab.com)

The Monitor Gateway should now be the single point of control for all PA calls, including querying cube. This is very different to the situation we have now, and a new analytics-manager should probably be developed in parallel to the current one.

  • Use the self-managed provider if one is provided by either the instance admin or the project owner.
  • SM instance regularly keeps access information up to date from customers.gitlab.com. (From the diagrams here, I think this is already handled)
  • Project owner chooses "use managed-provider".
  • SM instance calls cloud.gitlab.com, assuming that the user is permitted to use the feature. (In this case, Ultimate only; an add-on purchase from CDot is also required.)
  • Monitor Gateway fetches signing keys from CustomersDot (CDot), if required.
  • Monitor Gateway validates JWT received from SM instance.
  • Monitor Gateway should then complete an onboarding process for the new instance. This should look similar to the current onboarding process for a new project, but should segment newly created ClickHouse databases by GitLab instance ID. (The instance ID is received via the JWT.)
    • For example, right now each project's ClickHouse database is named gitlab_project_PROJECT_ID, but it should now be named gitlab_project_XXXXX, where XXXXX is a hash of both the instance ID and the project ID.
  • Once provisioned, the connection details should be returned to the SM instance and stored at the instance level. I'm not yet sure what this payload should include.
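
The database naming in the onboarding step above could be sketched roughly as follows. This is only an illustration: the hash function (SHA-256), the `:` delimiter, the 16-character truncation, and the function name `clickhouse_database_name` are all assumptions, none of which are decided in this issue.

```python
import hashlib


def clickhouse_database_name(instance_id: str, project_id: int) -> str:
    """Return a per-instance, per-project ClickHouse database name.

    Assumption: SHA-256 over "<instance_id>:<project_id>", truncated to
    16 hex characters. None of these choices are decided in this issue.
    """
    if not instance_id:
        raise ValueError("instance_id must not be empty")
    # The delimiter keeps ("1", 23) and ("12", 3) from hashing the same input.
    raw = f"{instance_id}:{project_id}".encode("utf-8")
    digest = hashlib.sha256(raw).hexdigest()
    return f"gitlab_project_{digest[:16]}"
```

Because the hash is one-way, the Monitor Gateway would probably need to store the mapping from database name back to instance ID and project ID if it ever has to go in the other direction, for example during deprovisioning.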

This section of work is likely to be extensive. We would be rebuilding the vast majority of the current onboarding flow.

Querying cube

The authentication and querying flow to cube would have to move from a single API key stored at the instance level to the use of JWKS and a security context.

Currently, the GitLab API generates a JWT that grants access to the correct project. The encryption key is stored at the instance-level.

We should validate whether we can use the same key from CDot here. Can we verify the claim, which would include the instance ID as the sub field? That would help us deduce the correct ClickHouse (CH) database to query.

We would also have to combine the claim with the project ID, but that should be possible.
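
As a rough sketch of how the verified claim and the project ID might combine into a security context: the code below assumes the JWT signature has already been verified against the JWKS upstream, that `sub` carries the instance ID, that `exp` is a Unix timestamp, and that the database name is a truncated SHA-256 hash of both IDs. The function name `build_security_context` and the returned keys are hypothetical.

```python
import hashlib
import time
from typing import Optional


def build_security_context(
    claims: dict, project_id: int, now: Optional[float] = None
) -> dict:
    """Map already-verified JWT claims plus a project ID to a security context.

    Assumptions: the signature was checked upstream, `sub` holds the
    instance ID, and `exp` is a Unix timestamp.
    """
    now = time.time() if now is None else now
    instance_id = claims.get("sub")
    if not instance_id:
        raise ValueError("token has no 'sub' (instance ID) claim")
    if claims.get("exp", 0) <= now:
        raise ValueError("token has expired")
    # Assumed naming scheme: hash of instance ID and project ID together.
    raw = f"{instance_id}:{project_id}".encode("utf-8")
    digest = hashlib.sha256(raw).hexdigest()
    return {
        "instance_id": instance_id,
        "project_id": project_id,
        "database": f"gitlab_project_{digest[:16]}",
    }
```

The point of the sketch is that the database is derived only from the verified `sub` claim and the requested project ID, so an instance cannot name another instance's database directly.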
