Remove fallback to deployment_platform_cluster in `DeploymentMetrics`
Background
For each Merge Request, we display performance metrics if we detect that an associated environment has metrics. For example, see the screenshot below.
The metrics could come from a manually set up Prometheus service or an in-cluster Prometheus.
Problem to solve
Two problems.
- ~bug - We compute which cluster the MR was deployed to at request time. Because users can add, edit, or delete clusters, this computation can return a cluster other than the one the MR was actually deployed to.
- ~performance - The majority of deployments do not have a cluster-based Prometheus. However, as seen in https://gitlab.com/gitlab-org/gitlab-ce/issues/63475, we still waste resources computing `deployment_platform_cluster`, at least in terms of query count.
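To make the ~bug case concrete, here is a minimal sketch in plain Ruby. The `Cluster` and `Project` classes and the `live_cluster_for` method are hypothetical stand-ins, not GitLab's actual models, and the lookup rule (match on environment scope) is an assumption chosen only to show how a live computation can drift from the cluster that was really used.

```ruby
# Hypothetical, simplified stand-ins; not GitLab's actual models.
Cluster = Struct.new(:id, :environment_scope)

class Project
  attr_reader :clusters

  def initialize
    @clusters = []
  end

  # Live lookup: returns whichever cluster matches the environment right now.
  def live_cluster_for(environment_name)
    clusters.find { |cluster| cluster.environment_scope == environment_name }
  end
end

project = Project.new
project.clusters << Cluster.new(1, "production")

# The deployment happens while cluster 1 is configured.
deployed_to = project.live_cluster_for("production")

# Later, a user deletes that cluster and adds a different one.
project.clusters.clear
project.clusters << Cluster.new(2, "production")

# The same computation now returns a cluster the MR was never deployed to.
computed_later = project.live_cluster_for("production")

puts deployed_to.id    # 1
puts computed_later.id # 2
```

Nothing in the live lookup records which cluster existed at deploy time, so the answer depends on when the question is asked.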
Proposal
- For new deployments, save which cluster it was deployed to and use that instead. Keep the fallback to live compute for old MRs. DONE in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/29960.
- After 1 month we should remove the fallback and let these old MRs lose their metrics. If they are that old, the metrics are unlikely to be useful anyway, and all they need to do to get their metrics back is trigger a redeploy.
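The two steps of the proposal can be sketched roughly as follows. This is plain Ruby with hypothetical names (`Deployment`, `CLUSTERS`, `live_computed_cluster`, `cluster_with_fallback`, `cluster_without_fallback`); it is not the code from the linked merge request, only an illustration of the lookup order before and after the fallback is removed.

```ruby
# Hypothetical, simplified stand-ins; not GitLab's actual models.
Cluster = Struct.new(:id)
Deployment = Struct.new(:cluster_id)

CLUSTERS = { 1 => Cluster.new(1), 2 => Cluster.new(2) }.freeze

# Stand-in for the live computation (assumed here to resolve to cluster 2).
def live_computed_cluster
  CLUSTERS[2]
end

# Step 1: prefer the persisted cluster, fall back to live compute for old deployments.
def cluster_with_fallback(deployment)
  return CLUSTERS[deployment.cluster_id] if deployment.cluster_id

  live_computed_cluster
end

# Step 2: fallback removed. Old deployments without a cluster_id get no cluster.
def cluster_without_fallback(deployment)
  return nil unless deployment.cluster_id

  CLUSTERS[deployment.cluster_id]
end

new_deployment = Deployment.new(1)   # created after cluster_id started being saved
old_deployment = Deployment.new(nil) # created before, so nothing was persisted

puts cluster_with_fallback(new_deployment).id        # 1
puts cluster_with_fallback(old_deployment).id        # 2 (live computed, may be wrong)
puts cluster_without_fallback(old_deployment).inspect # nil
```

In step 2 an old deployment simply yields no cluster, which corresponds to the MR losing its metrics until it is redeployed.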
Can we backfill?
I think not, see 1 - if we persist the live-computed value, we will be baking in the wrong cluster for all time. Unfortunately, there is no way of telling what value of `cluster_id` old deployments should have.
The impact is that old MRs which previously returned performance metrics will no longer do so until their next deployment (pipeline). This will only affect MRs which deployed to Kubernetes clusters.
Links
- See discussion in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/30224#note_187003879
- Original issue https://gitlab.com/gitlab-org/gitlab-ce/issues/63475
Edited by Thong Kuah