Implement Sidekiq throttling middleware in GitLab Rails

This issue aims to implement a throttling middleware that adjusts a worker's concurrency limit based on database indicators. We'll use the concurrency limit middleware as the mechanism to throttle the number of concurrent jobs a worker is allowed to perform.

Indicators:

  1. Client-side DB duration per minute
  2. DB-side active connections

Throttling Effect:

  • If the 1st indicator is violated, the worker is subjected to a "soft throttle"
  • If both indicators are violated, the worker is subjected to a "hard throttle"
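
The mapping from indicator state to throttle level could be sketched as below. This is a minimal illustration, not the actual middleware: `throttle_level` and its keyword arguments are hypothetical names. The issue does not say what happens when only the 2nd indicator is violated, so the sketch assumes no throttle in that case.

```ruby
# Hypothetical helper: maps the state of the two indicators to a throttle level.
# The method and argument names are illustrative, not taken from the GitLab codebase.
def throttle_level(db_duration_exceeded:, db_connections_exceeded:)
  # Both indicators violated => hard throttle.
  return :hard if db_duration_exceeded && db_connections_exceeded
  # Only the 1st indicator (client-side DB duration) violated => soft throttle.
  return :soft if db_duration_exceeded

  # Assumption: the issue defines no effect when only the 2nd indicator
  # is violated, so this sketch applies no throttle.
  :none
end
```

Returning a symbol keeps the decision separate from the limit arithmetic, so the multipliers can change without touching the indicator logic.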

Glossary

  • Throttle - Decrease the current concurrency limit in Redis. SidekiqMiddleware::ConcurrencyLimit::Middleware is responsible for deferring the job into a queue and releasing it back.
  • Soft Throttle - the new limit is current_limit * 0.8
  • Hard Throttle - the new limit is current_limit * 0.5
  • Recovery - the new limit is the greater of current_limit + 1 or current_limit * 1.1, applied while the current limit is below the max limit
  • (the numbers above can change)
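
As a rough sketch of the arithmetic in the glossary, the helpers below compute the next limit for a throttle or a recovery step. The method names are hypothetical. The rounding direction, the floor of 1, and the cap at the max limit are assumptions, since the issue only gives the multipliers. Rational literals are used so that results such as 50 * 1.1 are not skewed by floating-point error.

```ruby
# Multipliers from the glossary, written as Rational literals to avoid float rounding.
SOFT_FACTOR = 0.8r
HARD_FACTOR = 0.5r
RECOVERY_FACTOR = 1.1r

# Hypothetical helper: next limit after a :soft or :hard throttle.
def throttled_limit(current_limit, level)
  factor = { soft: SOFT_FACTOR, hard: HARD_FACTOR }.fetch(level)
  # Assumption: round down, and never let the limit fall below 1.
  [(current_limit * factor).floor, 1].max
end

# Hypothetical helper: next limit after one recovery step.
def recovered_limit(current_limit, max_limit)
  # Recovery only applies while the current limit is below the max limit.
  return current_limit unless current_limit < max_limit

  candidate = [current_limit + 1, (current_limit * RECOVERY_FACTOR).ceil].max
  # Assumption: recovery never overshoots the max limit.
  [candidate, max_limit].min
end
```

With these rules, the `current_limit + 1` branch only matters for small limits (below 10), where a 10% increase would otherwise round to no change.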

Prerequisites

  1. Track the dynamic concurrency limit in Redis: gitlab-com/gl-infra/data-access/durability/team#146 (closed)
  2. Configure the starting max concurrency limit based on urgency and the Sidekiq shard fleet's max threads: gitlab-com/gl-infra/data-access/durability/team#215 (closed) (ref)
    • e.g. a low-urgency worker in catchall would be allowed to use 20% of the fleet (which is 2160 jobs, currently).
    • This allows the numbers to scale with the fleet size, i.e. for Dedicated and self-managed.
    • Individual workers could adjust this percentage, and we could allow configuration to override the count altogether.
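
The percentage-based starting limit from prerequisite 2 might look like the sketch below. `max_concurrency_limit` and its parameters are hypothetical names, and the fleet size in the usage lines is a made-up figure for illustration only. Rounding down and the minimum of 1 are assumptions.

```ruby
# Hypothetical helper: derives a worker's starting max concurrency limit
# from the fleet's total thread count and a percentage of that fleet.
def max_concurrency_limit(fleet_max_threads:, fleet_percentage:, override: nil)
  # A configured override replaces the computed count altogether.
  return override if override

  # Assumption: round down, with a minimum of 1 so a worker can always run.
  [(fleet_max_threads * fleet_percentage / 100r).floor, 1].max
end

# Illustrative values only: a fleet of 1000 threads and a 20% share.
default_limit = max_concurrency_limit(fleet_max_threads: 1000, fleet_percentage: 20)
pinned_limit = max_concurrency_limit(fleet_max_threads: 1000, fleet_percentage: 20, override: 50)
```

Because the result is derived from the fleet size rather than hard-coded, the same percentage yields a proportionally smaller limit on a smaller installation.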

Tasks

  1. Implement the throttling middleware based on the 2 indicators and effects above
  2. Implement the recovery background thread to restore the current limit to its max limit
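
A recovery background thread for task 2 could be structured roughly as follows. This is a sketch under stated assumptions: `LimitRecoverer` is a hypothetical class, plain hashes stand in for the limits tracked in Redis, the 60-second interval is arbitrary, and the recovery step is passed in as a block rather than fixed.

```ruby
# Hypothetical sketch of a recovery loop. In-memory hashes stand in for Redis.
class LimitRecoverer
  def initialize(limits:, max_limits:, interval: 60, &step)
    @limits = limits          # worker name => current limit
    @max_limits = max_limits  # worker name => max limit
    @interval = interval      # seconds between recovery passes (arbitrary default)
    @step = step              # block that computes the next limit from the current one
    @mutex = Mutex.new
  end

  # One recovery pass: raise every limit that is below its max, capped at the max.
  def tick
    @mutex.synchronize do
      @limits.each_key do |worker|
        max = @max_limits.fetch(worker)
        current = @limits[worker]
        next unless current < max

        @limits[worker] = [@step.call(current), max].min
      end
    end
  end

  # Runs recovery passes in a background thread until #stop is called.
  def start
    @thread = Thread.new do
      loop do
        sleep @interval
        tick
      end
    end
  end

  def stop
    @thread&.kill
    @thread&.join
  end
end
```

Keeping `tick` separate from the thread loop lets a single recovery pass be exercised directly, without waiting on the timer.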

Previous description

This issue is a placeholder for more focused discussions from https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/3775 regarding the throttling mechanism.

Looking at https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/3775#note_2089502716, this is currently the top contender for how the throttling mechanism will buffer and release jobs using Gitlab::SidekiqMiddleware::ConcurrencyLimit.

Other aspects to consider:

References

Edited by Marco Gregorius