Sidekiq high-urgency job AuthorizedProjectsWorker not meeting performance targets 25% of the time
Resources
What
On GitLab Dedicated, running 17.11.x at the time of writing, we've had a number of incidents caused by slow execution of AuthorizedProjectsWorker, which is marked as high urgency and not CPU bound.
Even without any saturation on the infrastructure side, these jobs do not meet the requirements for high urgency.
This has an impact on other high-urgency jobs, which are slowed down as a result.
For this example tenant (eligible_pink_eel -> visualization "Sidekiq - AuthorizedProjectsWorker slow execution"), only 75% of AuthorizedProjectsWorker jobs completed within 10s over a span of 7 days.
Extract from one sample log message with durations (sensitive fields removed):
{
  "time": "2025-06-04T14:43:12.824Z",
  "created_at": "2025-06-04T14:42:25.685Z",
  "enqueued_at": "2025-06-04T14:42:29.335Z",
  "completed_at": "2025-06-04T14:43:12.824Z",
  "container_name": "sidekiq",
  "class": "AuthorizedProjectsWorker",
  "queue": "urgent_other",
  "correlation_id": "01JWXP9DMR3K8XMT2WYEKPEPVN",
  "meta.caller_id": "Groups::GroupLinksController#destroy",
  "meta.feature_category": "permissions",
  "message": "AuthorizedProjectsWorker JID-c4e3f27a9da274e2f47a20d5: done: 21.913355 sec",
  "redis_calls": 19,
  "redis_duration_s": 2.531906,
  "redis_read_bytes": 35,
  "redis_write_bytes": 22603,
  "redis_queues_calls": 2,
  "redis_queues_duration_s": 0.000555,
  "redis_queues_read_bytes": 5,
  "redis_queues_write_bytes": 19526,
  "redis_queues_metadata_calls": 16,
  "redis_queues_metadata_duration_s": 2.231653,
  "redis_queues_metadata_read_bytes": 29,
  "redis_queues_metadata_write_bytes": 3004,
  "redis_shared_state_calls": 1,
  "redis_shared_state_duration_s": 0.299698,
  "redis_shared_state_read_bytes": 1,
  "redis_shared_state_write_bytes": 73,
  "db_count": 8,
  "db_write_count": 2,
  "db_primary_count": 8,
  "db_primary_write_count": 2,
  "db_primary_duration_s": 12.981,
  "db_main_count": 8,
  "db_main_write_count": 2,
  "db_main_duration_s": 12.981,
  "db_duration_s": 14.625909,
  "duration_s": 21.913355
}
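To make it easier to see where the time goes in the sample log entry above, here is a minimal Python sketch that splits duration_s into database, Redis and remaining time. It also estimates how long the job waited between enqueue and start, under the assumption that duration_s covers only execution and that completed_at marks its end. The helper names parse_ts and breakdown are illustrative and not part of GitLab.

```python
import json
from datetime import datetime

# Subset of the sample log entry above (values copied verbatim).
SAMPLE = json.loads("""
{
  "enqueued_at": "2025-06-04T14:42:29.335Z",
  "completed_at": "2025-06-04T14:43:12.824Z",
  "redis_duration_s": 2.531906,
  "db_duration_s": 14.625909,
  "duration_s": 21.913355
}
""")


def parse_ts(value):
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def breakdown(entry):
    """Split a job's execution time into DB, Redis and remaining time."""
    total = entry["duration_s"]
    db = entry["db_duration_s"]
    redis = entry["redis_duration_s"]
    since_enqueue = (
        parse_ts(entry["completed_at"]) - parse_ts(entry["enqueued_at"])
    ).total_seconds()
    return {
        "db_s": db,
        "redis_s": redis,
        "other_s": round(total - db - redis, 6),
        "db_share": round(db / total, 3),
        # Assumption: time between enqueue and completion that is not
        # execution time was spent waiting in the queue.
        "queue_wait_s": round(since_enqueue - total, 6),
    }


for key, value in breakdown(SAMPLE).items():
    print(f"{key}: {value}")
```

For this sample, database time accounts for roughly two thirds of the execution time, and the job appears to have waited about as long in the queue as it took to run.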
Expected
These high-urgency jobs should complete within 10s; otherwise the worker should be downgraded to low urgency.
If this worker is CPU intensive, it may need to move to the high-urgency, CPU-bound category instead.
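
As a rough illustration of how the 10s target can be checked against a set of job durations, here is a small Python sketch. The function share_within_target is a hypothetical helper, and the durations are made-up values apart from the 21.913355s taken from the sample log; real numbers would come from the Sidekiq logs.

```python
TARGET_S = 10.0  # completion target for high-urgency jobs, as stated above


def share_within_target(durations_s, target_s=TARGET_S):
    """Return the fraction of jobs that finished within target_s seconds."""
    if not durations_s:
        raise ValueError("no durations given")
    within = sum(1 for d in durations_s if d <= target_s)
    return within / len(durations_s)


# Hypothetical durations in seconds; only 21.913355 comes from the sample log.
durations = [0.8, 1.2, 2.5, 3.1, 4.7, 9.6, 12.4, 21.913355]
share = share_within_target(durations)
print(f"{share:.0%} within {TARGET_S:.0f}s")
```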
