Track Jobs and Workers classes by endpoint

Proposal

Similarly to controllers and API endpoints, data about job and worker classes should also be turned into metrics and sent to Prometheus, especially now that these classes are also tagged with a feature_category attribute.

We should be able to reuse, at least in part, the abstractions that were set up when we started tracking calls to app and API controllers. Existing classes, such as the Metrics::RequestsEndpoints class and the custom Collector class, can be leveraged.

Sidekiq Middleware class

To track worker classes (i.e. classes under app/workers) and job classes (i.e. classes under app/jobs), we'll have to wrap each job execution with Benchmark.realtime to capture its duration, the same way it's done in the EmitsPrometheusMetrics module.
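As a rough illustration of the wrapping described above, here is a minimal sketch of a class that follows Sidekiq's server middleware interface (call(worker, job, queue) with a block that runs the job). The names JobMetricsMiddleware, FakeRecorder and ExampleWorker, as well as the label keys, are hypothetical and only there to make the sketch runnable on its own. The real implementation would go through the refactored EmitsPrometheusMetrics logic rather than an in-memory recorder.

```ruby
require 'benchmark'

# Hypothetical in-memory recorder, standing in for the real Prometheus client.
class FakeRecorder
  attr_reader :observations

  def initialize
    @observations = []
  end

  # Stores one [labels, duration_in_seconds] pair per executed job.
  def observe(labels, duration)
    @observations << [labels, duration]
  end
end

# Hypothetical middleware: times each job with Benchmark.realtime and
# records the duration together with the outcome of the job.
class JobMetricsMiddleware
  def initialize(recorder)
    @recorder = recorder
  end

  def call(worker, _job, queue)
    error = nil
    duration = Benchmark.realtime do
      yield
    rescue StandardError => e
      # Keep the error so the duration is still recorded for failed jobs.
      error = e
    end

    labels = {
      endpoint: worker.class.name, # assumed label: the worker class name
      queue: queue,
      status: error ? 'failure' : 'success'
    }
    @recorder.observe(labels, duration)

    # Re-raise so that Sidekiq's own retry handling still sees the failure.
    raise error if error
  end
end

# Hypothetical worker used only to exercise the middleware.
class ExampleWorker
  def perform
    :done
  end
end

recorder = FakeRecorder.new
middleware = JobMetricsMiddleware.new(recorder)
worker = ExampleWorker.new
middleware.call(worker, {}, 'default') { worker.perform }
puts recorder.observations.inspect
```

In a real application, a class like this would be registered through the server middleware chain in Sidekiq's server configuration, so that every job execution passes through it.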

Task breakdown

This is an attempt to outline what needs to be done as part of this issue. Each task below should take one or two MRs:

  • Refactor EmitsPrometheusMetrics so that we can use it in a worker/job context: we'll need to extract that logic from app/controllers/concerns and make it usable by the future Sidekiq Middleware class, in addition to the API and app controllers it already serves today.
  • Update the list of endpoints that are initialized when the Collector is initialized: we'll have to add new endpoint label values (based on job and worker class names) to the existing Prometheus counters.
  • Add the Sidekiq Middleware class: bring all this work together by creating a Sidekiq server middleware class, which will wrap the execution of each job with methods provided by the refactored EmitsPrometheusMetrics module.
  • Update the runbooks accordingly: add the gitlab_sli_sidekiq_execution* SLIs to the runbooks.
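
To illustrate the second task, the sketch below shows one way counters could be pre-initialized with endpoint label values derived from job and worker class names, so that every series exists at zero before the first job runs. The EndpointCounters class, the status values and the two example classes are hypothetical; the actual Collector class and its API are not shown in this issue, so this is only an assumption about the general shape.

```ruby
# Hypothetical stand-ins for classes found under app/workers and app/jobs.
class ExampleWorker; end
class ExampleJob; end

# Hypothetical counter store: one counter per (endpoint, status) pair,
# all created at zero when the object is initialized.
class EndpointCounters
  STATUSES = %w[success failure].freeze # assumed status label values

  def initialize(endpoint_classes)
    @counts = {}
    endpoint_classes.each do |klass|
      STATUSES.each { |status| @counts[[klass.name, status]] = 0 }
    end
  end

  # Increments a counter, rejecting endpoints that were not pre-initialized.
  def increment(endpoint, status)
    key = [endpoint, status]
    raise KeyError, "unknown series: #{key.inspect}" unless @counts.key?(key)

    @counts[key] += 1
  end

  def value(endpoint, status)
    @counts.fetch([endpoint, status])
  end

  # Lists every (endpoint, status) pair known to the store.
  def series
    @counts.keys
  end
end

counters = EndpointCounters.new([ExampleWorker, ExampleJob])
counters.increment('ExampleWorker', 'success')
puts counters.series.inspect
```

Rejecting unknown endpoints is a design choice made for this sketch: it surfaces job or worker classes that were missed at initialization instead of silently creating new series.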
Edited by Etienne Baqué