The runbook troubleshooting/large-sidekiq-queue.md is incorrect and could lead to data loss in the worst case, but will more likely just cause confusion during an incident.

The runbook https://gitlab.com/gitlab-com/runbooks/blob/master/troubleshooting%2Flarge-sidekiq-queue.md states that jobs for users can be deleted using the following advice:

Dropping jobs for a specific user

Suppose user foo is generating a lot of import jobs. You can use the Sidekiq API in the Rails console to remove those specific jobs. To do this, we must first identify the arguments that are run with this Sidekiq job.

  1. Find the worker in question in https://gitlab.com/gitlab-org/gitlab-ee/tree/master/app/workers. For example, jobs in the repository_import queue correspond to repository_import_worker.rb: https://gitlab.com/gitlab-org/gitlab-ee/blob/master/app/workers/repository_import_worker.rb.

  2. Look at the arguments specified in def perform method. In this example, project_id is the only argument.

Now that we know we are looking for jobs that have a project_id, we can find out which projects are owned by the user. In the Rails console (sudo gitlab-rails console):

user = User.find_by(username: 'foo')
id_list = user.projects.pluck(:id)

To kill any matching projects, we can run the following in the same console:

queue = Sidekiq::Queue.new('repository_import')
queue.each { |job| job.delete if id_list.include?(job.args[0]) }

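During a high-pressure situation, an operator may assume that this script works for any queue. It does not: it is specific to the repository_import queue, because it relies on job.args[0] being a project_id. Other workers take different arguments in that position, so running the same snippet against another queue could delete unrelated jobs or match nothing at all.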
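One way the runbook snippet could be made harder to misuse is to wrap it in a helper that refuses to run against any other queue and defaults to a dry run. The following is only a minimal sketch, not a proposed final implementation: the helper name delete_import_jobs and the dry_run flag are hypothetical, and the worker class name RepositoryImportWorker is assumed from the file name repository_import_worker.rb. It relies only on the queue responding to name and each, and on each job responding to klass, args and delete, as Sidekiq's API objects do.

```ruby
# Hypothetical helper (not part of the runbook or of GitLab).
# Deletes queued import jobs whose first argument is one of project_ids.
# Returns the number of matching jobs. With dry_run: true (the default)
# nothing is deleted, so the count can be checked first.
def delete_import_jobs(queue, project_ids, dry_run: true)
  # job.args[0] is only known to be a project_id for this one queue.
  unless queue.name == 'repository_import'
    raise ArgumentError,
          "args[0] is not known to be a project_id for queue #{queue.name.inspect}"
  end

  matched = queue.select do |job|
    # Assumed worker class name; any other job class in the queue is skipped.
    job.klass == 'RepositoryImportWorker' && project_ids.include?(job.args[0])
  end

  matched.each(&:delete) unless dry_run
  matched.size
end

# Possible usage in the Rails console, reusing id_list from above:
#   queue = Sidekiq::Queue.new('repository_import')
#   delete_import_jobs(queue, id_list)                  # dry run, returns a count
#   delete_import_jobs(queue, id_list, dry_run: false)  # actually deletes
```

Running the dry run first gives the operator a count to sanity-check before anything is removed, and the queue-name guard turns the "works for any queue" assumption into an immediate error instead of a silent mismatch.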

Once we deliver scalability#9 we will have a way of removing jobs in a generic manner, but until then we should place more emphasis on this not being a general approach.


Additionally, the runbooks do not currently reference the Sidekiq Monitor feature and its ability to cancel running jobs, as described in the documentation: https://gitlab.com/gitlab-org/gitlab-foss/blob/master/doc/administration/troubleshooting/sidekiq.md#canceling-running-jobs-destructive