Implement Federated Sidekiq APIs
From the curated list of interactions with redis-sidekiq, many components require data across all instances. The purposes of data collection vary. The most surprising component is the Sidekiq Web UI, which collects nearly all available Sidekiq data. In a zonal cluster setting, that component is effectively broken.
I would like to propose a solution. We can implement a federation support layer on top of the Sidekiq APIs. This federation layer is responsible for broadcasting queries and accumulating the received data. A large portion of the components use the Sidekiq API exclusively, so they gain multi-instance support immediately. Other components issue Redis commands directly to the Redis instance. Those commands are trivial and can be considered an extension of the Sidekiq APIs, so we can move them to the same place. All further access to Sidekiq data must go through the federation layer. This improves abstraction and isolation, and hides the complexity of handling multiple Redis instances in the future.
By default, the Sidekiq APIs work at the zonal level. The federation layer provides an interface to access data at the federation level. The interface must not change the underlying API. This is an example:
Gitlab::SidekiqFederation.federate do
  q = Sidekiq::Queue.new(queue)
  q.size # Total size of the queue across all instances
  q.find_job("ABC") # Return the job found in any of the instances

  s = Sidekiq::Stats.new
  s.scheduled_size # Return the total count of scheduled jobs across all clusters
end
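One way the block-scoped switch between zonal and federated behaviour could work is sketched below. This is only an illustration under stated assumptions: the module name `FederationSketch`, the flag key, and the `federated?` helper are hypothetical and are not taken from the POC. The idea is that the block sets a flag for its duration and restores the previous value afterwards, so code outside the block keeps the default zonal behaviour.

```ruby
# Hypothetical sketch; not the real Gitlab::SidekiqFederation implementation.
module FederationSketch
  # Assumed flag key, chosen for illustration only.
  FLAG = :federation_sketch_enabled

  # Runs the block with the flag set, then restores the previous value,
  # even if the block raises. Note that Thread.current[] storage is
  # fiber-local in Ruby, so the flag is scoped to the current fiber.
  def self.federate
    previous = Thread.current[FLAG]
    Thread.current[FLAG] = true
    yield
  ensure
    Thread.current[FLAG] = previous
  end

  # True only while running inside a federate block.
  def self.federated?
    Thread.current[FLAG] == true
  end
end

outside = FederationSketch.federated?                              # => false
inside  = FederationSketch.federate { FederationSketch.federated? } # => true
```

Because the previous value is restored rather than reset to `nil`, nested `federate` blocks behave as expected in this sketch.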
One good use case for this API is the Sidekiq Web UI. This UI is implemented as an independent Rack application that is mounted in the Rails application. We can put a thin Rack middleware in front of it that enables federation mode for its requests.
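A thin middleware of this kind might look like the sketch below. The class name `FederatedWebSketch` and the flag key are hypothetical, and a lambda stands in for the Sidekiq Web UI. Only the Rack calling convention (`call(env)` returning a status, headers, and body triple) is standard.

```ruby
# Hypothetical sketch of a thin Rack-style middleware that turns on a
# federation flag while the wrapped app handles the request.
class FederatedWebSketch
  # Assumed flag key, chosen for illustration only.
  FLAG = :federation_sketch_enabled

  def initialize(app)
    @app = app
  end

  def call(env)
    previous = Thread.current[FLAG]
    Thread.current[FLAG] = true
    @app.call(env)
  ensure
    # Restore the previous value so other requests are unaffected.
    Thread.current[FLAG] = previous
  end
end

# Stand-in for the Sidekiq Web UI: reports which mode it observed.
inner_app = lambda do |_env|
  mode = Thread.current[FederatedWebSketch::FLAG] ? "federated" : "zonal"
  [200, { "content-type" => "text/plain" }, [mode]]
end

response = FederatedWebSketch.new(inner_app).call({})
# response => [200, {"content-type"=>"text/plain"}, ["federated"]]
```

One caveat of this sketch: Rack allows a response body to be iterated after `call` returns, so an app that reads Sidekiq data lazily while streaming its body would no longer see the flag.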
The implementation may involve patching the Sidekiq API. We want this component to stay compatible with future Sidekiq upgrades, or at least to give us a way to control the upgrade. This federation support is also used outside of the Rails application, for example in gitlab-exporter. Therefore, creating a dedicated gem, let's name it gitlab-sidekiq-federation, is a good approach.
I implemented a small POC for this federation concept. The POC is available at this commit. The technique is simple: it wraps the existing classes (::Sidekiq::Stats in this case) with a collection class. Any access to that wrapper instance broadcasts the query to the corresponding pool. The heavy lifting is still handled by the Sidekiq API, so the federation layer stays thin and straightforward.
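The wrapping technique can be illustrated with a minimal, self-contained sketch. This is not the POC code: `FakeStats` is a stand-in for ::Sidekiq::Stats, `FederatedSketch` is a hypothetical wrapper name, and the per-zone objects stand in for whatever per-pool lookup the real layer performs. The wrapper forwards each call to every wrapped object and sums the results when they are all numeric.

```ruby
# Stand-in for ::Sidekiq::Stats, one object per Redis instance (assumption).
FakeStats = Struct.new(:scheduled_size, :processed)

# Hypothetical collection wrapper: broadcast each call, then accumulate.
class FederatedSketch
  def initialize(instances)
    @instances = instances
  end

  def method_missing(name, *args, &block)
    return super unless @instances.all? { |i| i.respond_to?(name) }

    results = @instances.map { |i| i.public_send(name, *args, &block) }
    # Numeric results are summed; anything else is returned per instance.
    results.all? { |r| r.is_a?(Numeric) } ? results.sum : results
  end

  def respond_to_missing?(name, include_private = false)
    @instances.all? { |i| i.respond_to?(name, include_private) } || super
  end
end

zone_a = FakeStats.new(3, 100)
zone_b = FakeStats.new(4, 250)
stats = FederatedSketch.new([zone_a, zone_b])

stats.scheduled_size # => 7
stats.processed      # => 350
```

Summing only covers counters. Calls that return a single object, such as looking up one job, need a different accumulation rule, which this sketch leaves as a plain list of per-instance results.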
Running the POC in the console, we'll get the accumulated result.
::Sidekiq::Stats.new.scheduled_size # Return current instance's stats

::Gitlab::SidekiqFederation.federate do
  stats = ::Sidekiq::Stats.new
  puts stats.scheduled_size # Return accumulated scheduled size
  puts stats.processed # Return accumulated processed jobs
  stats.reset # Reset stats of all Redis instances
end
This implementation approach still needs some clarification, for example around singular job operations, but I think it's generally promising.