Audit unused Sidekiq metrics

Context

As @reprazent said in gitlab-org/gitlab!116827 (comment 1351399791), we can remove some Sidekiq server metrics in https://gitlab.com/gitlab-org/gitlab/-/blob/f2021fd6b8b0a8a23451db20890a8dc4e29bf51f/lib/gitlab/sidekiq_middleware/server_metrics.rb#L23-41

The scope of this issue is to determine which metrics can be removed. If a metric is used on dashboards, we could point those dashboards at visualizations built on the logs instead.

List of metrics

Metrics emitted from Sidekiq Middleware

The "Used in .com?" column indicates whether the metric is referenced in the runbooks repo.

(Scroll right for more fields)

| Metric name | Type | Used in .com? | Equivalent Kibana logs field | Status | Remarks |
| --- | --- | --- | --- | --- | --- |
| sidekiq_jobs_completion_seconds | Histogram | dashboards | json.duration_s | Removed in .com gitlab-org/gitlab!128706 (merged) with ops FF emit_sidekiq_histogram_metrics | Replaced with Application SLI for Apdex measurement. Dashboards replaced with Kibana viz. |
| sidekiq_jobs_queue_duration_seconds | Histogram | dashboards | json.scheduling_latency_s | Removed in .com gitlab-org/gitlab!128706 (merged) with ops FF emit_sidekiq_histogram_metrics | Replaced with Application SLI for Apdex measurement. Dashboards replaced with Kibana viz. |
| sidekiq_jobs_failed_total | Counter | dashboards | NA | Removed in .com gitlab-org/gitlab!128706 (merged) with ops FF emit_sidekiq_histogram_metrics | Replaced with Application SLI for Apdex measurement. |
| sidekiq_jobs_cpu_seconds | Histogram | dashboards | json.cpu_s | Removed in gitlab-org/gitlab!131001 (merged) | To be replaced with average instead of quantile |
| sidekiq_jobs_db_seconds | Histogram | dashboards | json.db_duration_s | Removed in gitlab-org/gitlab!131001 (merged) | To be replaced with average instead of quantile |
| sidekiq_jobs_gitaly_seconds | Histogram | dashboards | json.gitaly_duration_s | Removed in gitlab-org/gitlab!131001 (merged) | To be replaced with average instead of quantile |
| sidekiq_redis_requests_duration_seconds | Histogram | dashboards | json.redis_duration_s | Removed in gitlab-org/gitlab!131001 (merged) | To be replaced with average instead of quantile |
| sidekiq_elasticsearch_requests_duration_seconds | Histogram | dashboards | json.elasticsearch_duration_s | Removed in gitlab-org/gitlab!131001 (merged) | To be replaced with average instead of quantile |
| sidekiq_jobs_retried_total | Counter | | NA | Keep | |
| sidekiq_jobs_interrupted_total | Counter | | NA | Keep | |
| sidekiq_redis_requests_total | Counter | | NA | Keep | |
| sidekiq_elasticsearch_requests_total | Counter | | NA | Keep | |
| sidekiq_running_jobs | Gauge | dashboards | NA | Keep | |
| sidekiq_concurrency | Gauge | dashboards | NA | Keep | |
| sidekiq_mem_total_bytes | Gauge | | NA | Keep | No references in runbooks but was recently added in gitlab-org/gitlab!92785 (merged) |
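
Several of the removed histograms are marked "To be replaced with average instead of quantile", with a log field such as json.cpu_s listed as the equivalent. As a rough sketch of what an average over a log field looks like, the Ruby snippet below computes a per-worker mean from JSON log lines. The log lines, the worker names, and the `average_by_worker` helper are made up for illustration, and the snippet assumes the Kibana field json.cpu_s corresponds to a `cpu_s` key in each log entry. It is not how Kibana or GitLab computes the value.

```ruby
require 'json'

# Made-up log lines for illustration; real Sidekiq logs carry many more fields.
log_lines = [
  '{"class":"ExampleWorker","cpu_s":0.5}',
  '{"class":"ExampleWorker","cpu_s":1.5}',
  '{"class":"OtherWorker","cpu_s":0.25}',
  '{"class":"OtherWorker"}'
]

# Hypothetical helper: mean of a numeric field, grouped by worker class.
# Entries that do not carry the field are skipped.
def average_by_worker(lines, field)
  sums = Hash.new(0.0)
  counts = Hash.new(0)

  lines.each do |line|
    entry = JSON.parse(line)
    value = entry[field]
    next if value.nil?

    sums[entry['class']] += value
    counts[entry['class']] += 1
  end

  sums.to_h { |worker, sum| [worker, sum / counts[worker]] }
end

averages = average_by_worker(log_lines, 'cpu_s')
```

Unlike a quantile, an average needs only a running sum and a count per worker, so no bucket boundaries have to be chosen.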

Status 2023-11-09

We have stopped emitting all histograms from Sidekiq on GitLab.com. Histograms are high in cardinality and provide only limited accuracy, which depends on the number of buckets defined.
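
To illustrate why accuracy depends on the number of buckets, here is a minimal Ruby sketch that estimates a quantile from cumulative bucket counts using linear interpolation inside the matching bucket. The sample durations, the bucket boundaries, and the helper names are all made up for illustration. They are not the buckets used in server_metrics.rb, and the interpolation is a simplified stand-in for what a metrics backend does.

```ruby
# Cumulative counts: how many samples fall at or below each upper bound.
def bucket_counts(samples, bounds)
  bounds.map { |le| samples.count { |s| s <= le } }
end

# Estimate quantile q by interpolating linearly inside the first bucket
# whose cumulative count reaches the target rank.
def estimate_quantile(q, bounds, cumulative)
  rank = q * cumulative.last
  idx = cumulative.index { |c| c >= rank }
  lower_bound = idx.zero? ? 0.0 : bounds[idx - 1]
  lower_count = idx.zero? ? 0 : cumulative[idx - 1]
  in_bucket = cumulative[idx] - lower_count
  return bounds[idx] if in_bucket.zero?

  lower_bound + (bounds[idx] - lower_bound) * (rank - lower_count) / in_bucket.to_f
end

# Hypothetical job durations in seconds (made-up data).
samples = [0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 1.1, 1.2, 4.5]

coarse_bounds = [1.0, 10.0]
fine_bounds = [0.25, 0.5, 1.0, 2.5, 5.0, 10.0]

coarse_p90 = estimate_quantile(0.9, coarse_bounds, bucket_counts(samples, coarse_bounds))
fine_p90 = estimate_quantile(0.9, fine_bounds, bucket_counts(samples, fine_bounds))

puts "coarse p90 estimate: #{coarse_p90}" # 7.0
puts "fine p90 estimate:   #{fine_p90}"   # 2.5
```

With these samples the ninth-smallest duration is 1.2 seconds. The two-bucket layout estimates 7.0 and the six-bucket layout estimates 2.5. More buckets narrow the error, but each extra bucket is another time series per label combination, which is the cardinality cost mentioned above.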

Summary of removed histograms:

The removed histograms are gated behind the ops feature flag emit_sidekiq_histogram_metrics (enabled by default). We stopped emitting them by disabling this flag.
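
The gating pattern can be sketched roughly as follows. `FlagStore` and `JobMetricsRecorder` are hypothetical stand-ins written for this example. They are not GitLab's feature flag API or the actual middleware code, and only the flag name emit_sidekiq_histogram_metrics comes from this issue.

```ruby
# Stand-in flag store for illustration; this is not GitLab's Feature API.
class FlagStore
  def initialize(flags)
    @flags = flags.dup
  end

  def enabled?(name)
    @flags.fetch(name, false)
  end

  def disable(name)
    @flags[name] = false
  end
end

# Hypothetical recorder: the counter is always updated, while histogram
# observations are recorded only while the flag is enabled.
class JobMetricsRecorder
  attr_reader :completed, :durations

  def initialize(flags)
    @flags = flags
    @completed = Hash.new(0)
    @durations = Hash.new { |hash, key| hash[key] = [] }
  end

  def record(worker, duration_s)
    @completed[worker] += 1
    return unless @flags.enabled?(:emit_sidekiq_histogram_metrics)

    @durations[worker] << duration_s
  end
end

flags = FlagStore.new(emit_sidekiq_histogram_metrics: true) # enabled by default
recorder = JobMetricsRecorder.new(flags)
recorder.record('ExampleWorker', 0.4)

# Disabling the flag stops histogram observations without a code change.
flags.disable(:emit_sidekiq_histogram_metrics)
recorder.record('ExampleWorker', 0.9)
```

After both calls the counter has seen two jobs, but only the first duration was observed, since the second call happened after the flag was disabled.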

Edited by Marco Gregorius