Make Sidekiq SLIs explorable in the error budget for stage groups dashboard
DRI @marcogreg

As `@smcgivern` brings up in https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/1365#note_728055809, `sidekiq_execution` isn't a real SLI; it's an aggregation we've tacked on in https://gitlab.com/gitlab-com/runbooks/blob/b08d1478fcc8b6ae05828710984da8976504da01/rules-jsonnet/sidekiq-feature-category-source-metrics.jsonnet#L9. That aggregation combines all executions into a single SLI, but the service metrics aggregate both the queueing and the execution into a single SLI per shard. The queueing is not something that stage groups can easily influence, though it is somewhat affected by the execution.

I think we should work to remove the disconnect between feature category recordings and service monitoring recordings, so Sidekiq is no longer a special case. &525 brought the `puma` component feature recordings in line with the service recordings; we could do the same for Sidekiq.

This will make Sidekiq explorable on the error budget for stage groups detail dashboard, including the breakdown by significant labels, which include `worker`. At the same time, the number of series that need to be recorded would be reduced.

### Proposals

Rework the Sidekiq SLIs to use only counters for successes, totals and errors, based on the tools and calculations we already have for [Application SLIs](https://docs.gitlab.com/ee/development/application_slis/). Then we can replace the current SLIs for execution and queueing with SLIs using these metrics, and reduce the metrics we need to record for

## Status 2023-08-14

Note for Grand Review: The epic is done and can be closed. The last piece, the [docs for stage groups](https://docs.gitlab.com/ee/development/application_slis/sidekiq_execution.html) to raise awareness, is up. Key changes and benefits were announced in this week's Engineering Week in Review doc.
Summary of the impacts:

* Stage groups can now see successes and failures (as apdex and error ratio) per worker in the Application SLI Violations dashboard: https://dashboards.gitlab.net/goto/WrYKTre4g?orgId=1.
* Infrastructure and Development now use the same definition for Sidekiq execution. Before, Sidekiq execution and queueing were mixed for Infrastructure.
* Infrastructure now has separate alerting for Sidekiq queueing. This SLI is owned by Infrastructure, not by stage groups. When stage groups meet the execution SLI, Infrastructure should be able to meet the queueing one.
* ~2 million fewer metrics emitted from GitLab.com. More histogram metrics can be removed in https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2297#list-of-metrics.

Future plan for self-managed and Dedicated is discussed in https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2474.
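
For illustration, here is a minimal sketch of how apdex and error ratio per worker could be derived from plain counters, as the proposal describes. The `WorkerCounters` type, its field names and the sample numbers are hypothetical and are not the actual metric names or recording rules used in the runbooks; it only shows the arithmetic, assuming each worker exposes success, total and error counts over some window.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class WorkerCounters:
    """Hypothetical counter increases for one Sidekiq worker over a window."""
    apdex_success: int  # executions that finished within the duration threshold
    apdex_total: int    # executions that were measured for apdex
    errors: int         # executions that failed
    total: int          # all executions


def apdex(c: WorkerCounters) -> Optional[float]:
    """Share of measured executions that met the threshold (None if no traffic)."""
    return c.apdex_success / c.apdex_total if c.apdex_total else None


def error_ratio(c: WorkerCounters) -> Optional[float]:
    """Share of executions that failed (None if no traffic)."""
    return c.errors / c.total if c.total else None


def aggregate(counters: Iterable[WorkerCounters]) -> WorkerCounters:
    """Sum counters across workers, e.g. to roll up to a feature category.

    Counters can be summed before dividing, which is why the same series
    can serve both the per-worker and the aggregated view.
    """
    items = list(counters)
    return WorkerCounters(
        apdex_success=sum(c.apdex_success for c in items),
        apdex_total=sum(c.apdex_total for c in items),
        errors=sum(c.errors for c in items),
        total=sum(c.total for c in items),
    )


# Made-up numbers for two workers in the same feature category.
worker_a = WorkerCounters(apdex_success=950, apdex_total=1000, errors=5, total=1000)
worker_b = WorkerCounters(apdex_success=40, apdex_total=100, errors=10, total=100)
category = aggregate([worker_a, worker_b])
```

With these made-up numbers, `apdex(worker_a)` is 0.95 while `apdex(category)` is 0.9, so a slow low-traffic worker such as `worker_b` stays visible in the per-worker breakdown even though it moves the aggregate less.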