Add an SLI for the monitoring service reporting failed recordings
In gitlab-com/runbooks!4140 we merged a change that tried to record invalid metrics in some, but not all, cases.
This generated a lot of errors in the logs with the message "vector contains metrics with the same labelset after applying rule labels", and caused the recordings to be missing in those cases. This did not surface in our monitoring dashboard, nor did we get any alerts. I only noticed it when manually validating my changes.
I think we should try to include this in an SLI for our monitoring service (the Prometheus & Thanos services after https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/14335), so that we receive alerts when recordings fail.
Perhaps it would even be appropriate to label this SLI with feature_category='error_budgets' now that this exists: incorrect recordings anywhere in the stack could influence the metrics that are used for error budgets.
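
As a rough sketch of what such an SLI could measure: Prometheus's rule manager exposes the counters prometheus_rule_evaluation_failures_total and prometheus_rule_evaluations_total, and the ratio of their rates would give an error ratio for recordings. The Python below only illustrates the idea. The job="prometheus" selector, the 5m window, and the helper names are hypothetical, and we would still need to confirm that the duplicate-labelset error above actually increments the failure counter in our setup before relying on it.

```python
# Counter names exposed by Prometheus's rule manager. Whether they capture
# the duplicate-labelset failures described above is an assumption to verify.
FAILURES_METRIC = "prometheus_rule_evaluation_failures_total"
EVALUATIONS_METRIC = "prometheus_rule_evaluations_total"


def rate_query(metric: str, selector: str, window: str) -> str:
    """Build a PromQL expression for a counter's per-second rate, summed per job."""
    return f"sum by (job) (rate({metric}{{{selector}}}[{window}]))"


def error_ratio(failure_rate: float, evaluation_rate: float) -> float:
    """Fraction of rule evaluations that failed; 0.0 when nothing was evaluated."""
    if evaluation_rate <= 0:
        return 0.0
    return failure_rate / evaluation_rate


if __name__ == "__main__":
    selector = 'job="prometheus"'  # hypothetical selector, adjust per service
    print(rate_query(FAILURES_METRIC, selector, "5m"))
    print(rate_query(EVALUATIONS_METRIC, selector, "5m"))
    # Example with made-up rates: 2 failures/s out of 400 evaluations/s.
    print(error_ratio(2.0, 400.0))
```

In the runbooks the two queries would presumably become the error-rate and operation-rate halves of the SLI definition, with the ratio being what the alert thresholds apply to.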