Skip to content

WIP: Use sidekiq-cluster in GDK

Matthias Käppler requested to merge 38343-use-sk-cluster-in-gdk-poc into master

Issue: #38343 (closed)

What does this MR do?

This MR suggests how sidekiq-cluster could be used in the GDK in a straight forward way.

The approach is to add a new CLI switch -w INT which sets the desired scale in terms of worker processes explicitly, i.e. without relying on having to pass queue groupings as CLI arguments. This is simple and straight-forward:

  • by using e.g. -w 2, we run two worker processes
  • each worker process will process all queues

There are several functions in bin/background_jobs that are not useful anymore when running sidekiq-cluster, mostly concerning process supervision. We should remove that and leave process management to runit and sidekiq-cluster itself. This is also the direction Sidekiq itself is taking, with several CLI switches that relate to process management being removed in Sidekiq 6.

To that end, the following functions of background_jobs have been altered or removed:

  • Make start_foreground the default case; backgrounding shouldn't be done by default. This is now done explicitly instead (see next point)
  • Added start_background task; this should be considered a "special case" when neither the GDK nor a process supervisor is used. This used to be the default behavior of rake sidekiq:start so I retained it this way.
  • Deprecated restart task; restarts should be handled by a process supervisor instead. In fact, this task is now defunct. The problem is that when starting a cluster, we cannot restart ourselves with the same parameters, since unlike a process supervisor we don't have access to the original start script that was passing parameters. Moreover, for a clean restart we would have to poll and wait for all cluster workers to shut down cleanly first. However, the caller can always call stop, then start again with the initial argument list at their own discretion.
  • Renamed start_no_deamonize task; this is now start_silent, but the behavior is retained (run in foreground, log to sidekiq log file)
  • Dropped the -P, -d and -L switches; all of these are "process supervision" related, will be dropped in Sidekiq 6 and should be handled by runit instead

Non-goals:

  • Make sidekiq-cluster available to Omnibus users, esp. outside of the free tier. This will be tackled later. Our only audience here are developers working on GitLab.

TODO & open questions:

  • Add unit tests covering the new process count switch
  • Figure out impact of dropping functions that are considered runit's job such as restart; Should we retain rake tasks that bypass the GDK and runit?
  • Are we impacting or breaking "from source" installations now that might not use runit? https://gitlab.com/gitlab-org/gitlab/blob/master/doc/install/installation.md#installation-from-source
  • Make sure GDK migration is smooth
  • Make sure Omnibus migration is smooth (Update: Omnibus does not appear to be using this script, see below)
  • Should Omnibus be using bin/background_jobs to reduce/consolidate the number of entry points to sidekiq?
  • What's the role of Procfile these days and should we change it?

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Available to all users of the Gitlab Development Kit.

Edited by 🤖 GitLab Bot 🤖

Merge request reports