Sidekiq high-urgency job AuthorizedProjectsWorker not meeting performance targets 25% of the time

Resources

Dashboard: https://dashboards.gitlab.net/d/sidekiq-worker-detail/sidekiq3a-worker-detail?orgId=1&from=now-30d&to=now&timezone=utc&var-PROMETHEUS_DS=mimir-gitlab-gprd&var-environment=gprd&var-stage=main&var-worker=AuthorizedProjectsWorker

What

On GitLab Dedicated, running 17.11.x at the time of writing, we have had a number of incidents caused by slow execution of AuthorizedProjectsWorker, which is marked as high urgency and not CPU bound.

Even without any additional saturation on the infrastructure side, these jobs do not meet the requirements for high job urgency.

The impact is that other urgent jobs are slowed down.

For this example tenant (eligible_pink_eel -> visualization "Sidekiq - AuthorizedProjectsWorker slow execution"), over a span of 7 days, only 75% of AuthorizedProjectsWorker jobs completed within 10s:

image
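
For reference, the 75% figure is a simple ratio of jobs that finished within the target to all jobs in the window. The sketch below shows one way to compute it from a list of `duration_s` values; the function name and the sample durations are hypothetical and are not real tenant data, and the actual visualization may compute the ratio differently.

```python
from typing import Iterable

# 10s execution target for high-urgency jobs, as quoted in this issue.
HIGH_URGENCY_TARGET_S = 10.0


def fraction_within_target(
    durations_s: Iterable[float],
    target_s: float = HIGH_URGENCY_TARGET_S,
) -> float:
    """Return the share of jobs whose execution time is at or under target_s."""
    durations = list(durations_s)
    if not durations:
        raise ValueError("no job durations supplied")
    within = sum(1 for d in durations if d <= target_s)
    return within / len(durations)


# Hypothetical durations (seconds), chosen only to illustrate the calculation.
example_durations = [0.8, 2.4, 21.9, 5.1]

if __name__ == "__main__":
    print(f"{fraction_within_target(example_durations):.0%} within target")
```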

Extract from one sample log message with durations (sensitive fields removed):

{
  "time": "2025-06-04T14:43:12.824Z",
  "created_at": "2025-06-04T14:42:25.685Z",
  "enqueued_at": "2025-06-04T14:42:29.335Z",
  "completed_at": "2025-06-04T14:43:12.824Z",
  "container_name": "sidekiq",
  "class": "AuthorizedProjectsWorker",
  "queue": "urgent_other",
  "correlation_id": "01JWXP9DMR3K8XMT2WYEKPEPVN",
  "meta.caller_id": "Groups::GroupLinksController#destroy",
  "meta.feature_category": "permissions",
  "message": "AuthorizedProjectsWorker JID-c4e3f27a9da274e2f47a20d5: done: 21.913355 sec",
  "redis_calls": 19,
  "redis_duration_s": 2.531906,
  "redis_read_bytes": 35,
  "redis_write_bytes": 22603,
  "redis_queues_calls": 2,
  "redis_queues_duration_s": 0.000555,
  "redis_queues_read_bytes": 5,
  "redis_queues_write_bytes": 19526,
  "redis_queues_metadata_calls": 16,
  "redis_queues_metadata_duration_s": 2.231653,
  "redis_queues_metadata_read_bytes": 29,
  "redis_queues_metadata_write_bytes": 3004,
  "redis_shared_state_calls": 1,
  "redis_shared_state_duration_s": 0.299698,
  "redis_shared_state_read_bytes": 1,
  "redis_shared_state_write_bytes": 73,
  "db_count": 8,
  "db_write_count": 2,
  "db_primary_count": 8,
  "db_primary_write_count": 2,
  "db_primary_duration_s": 12.981,
  "db_main_count": 8,
  "db_main_write_count": 2,
  "db_main_duration_s": 12.981,
  "db_duration_s": 14.625909,
  "duration_s": 21.913355
}
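
To make the log entry easier to read, here is a minimal sketch that splits the job's time into database, Redis, and remaining time, and estimates how long the job waited before it started. It rests on two assumptions that the log does not confirm: that `db_duration_s` and `redis_duration_s` do not overlap, and that the time between `enqueued_at` and `completed_at` not covered by `duration_s` was spent waiting in the queue. The `breakdown` and `parse_ts` helpers are hypothetical names.

```python
import json
from datetime import datetime

# Subset of the fields from the sample log message above.
SAMPLE_LOG = json.loads("""
{
  "enqueued_at": "2025-06-04T14:42:29.335Z",
  "completed_at": "2025-06-04T14:43:12.824Z",
  "redis_duration_s": 2.531906,
  "db_duration_s": 14.625909,
  "duration_s": 21.913355
}
""")


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' (UTC)."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def breakdown(entry: dict) -> dict:
    """Split execution time by component and estimate queue wait.

    Assumes DB and Redis time do not overlap, and that any time between
    enqueue and completion outside duration_s was queue wait.
    """
    total = entry["duration_s"]
    db = entry["db_duration_s"]
    redis = entry["redis_duration_s"]
    enqueue_to_done = (
        parse_ts(entry["completed_at"]) - parse_ts(entry["enqueued_at"])
    ).total_seconds()
    return {
        "db_share": db / total,
        "redis_share": redis / total,
        "other_s": total - db - redis,
        "queue_wait_s": enqueue_to_done - total,
    }


if __name__ == "__main__":
    for key, value in breakdown(SAMPLE_LOG).items():
        print(f"{key}: {value:.3f}")
```

Under those assumptions, database time accounts for roughly two thirds of the 21.9s execution in this sample, and the job appears to have waited about 21.6s between being enqueued and starting.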

Expected

These high-urgency jobs should complete within 10s, or the worker should be downgraded to low urgency.

If this worker turns out to be CPU intensive, it may need to be in the high-urgency, CPU-bound category instead.
