The runbook troubleshooting/large-sidekiq-queue.md is incorrect and could lead to data loss in the worst case, but likely just confusion during an incident
The runbook https://gitlab.com/gitlab-com/runbooks/blob/master/troubleshooting%2Flarge-sidekiq-queue.md gives the following advice for deleting a specific user's jobs:
Dropping jobs for a specific user
Suppose user foo is generating a lot of import jobs. You can use the Sidekiq API in the Rails console to remove those specific jobs. To do this, we must first identify the arguments that are run with this Sidekiq job.
Find the worker in question in https://gitlab.com/gitlab-org/gitlab-ee/tree/master/app/workers. For example, jobs in the repository_import queue correspond to repository_import_worker.rb: https://gitlab.com/gitlab-org/gitlab-ee/blob/master/app/workers/repository_import_worker.rb.
Look at the arguments specified in def perform method. In this example, project_id is the only argument.
Now that we know we are looking for jobs that have a project_id, we can find out which projects are owned by the user. In the Rails console (sudo gitlab-rails console):
    user = User.find_by(username: 'foo')
    id_list = user.projects.pluck(:id)
To kill any matching projects, we can run the following in the same console:
    queue = Sidekiq::Queue.new('repository_import')
    queue.each { |job| job.delete if id_list.include?(job.args[0]) }
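During a high-pressure situation, an operator may assume that this script works for any queue. It is specific to the repository_import queue! The filter relies on job.args[0] being a project_id, which is only known to hold for that worker; on a queue whose first argument means something else, the same filter could delete unrelated jobs.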
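One way the runbook snippet could be made harder to misuse is sketched below. This is only an illustration, not a proposed final wording: `delete_jobs_for_projects` and its `dry_run` flag are hypothetical names that exist neither in the runbook nor in Sidekiq. The sketch assumes that each queued job responds to `klass`, `args` and `delete`, as the job records yielded by `Sidekiq::Queue#each` do, and that the worker class name for this queue is `RepositoryImportWorker`.

```ruby
# Hypothetical helper, not part of the runbook or of Sidekiq.
# `queue` is anything enumerable whose items respond to #klass, #args and
# #delete (the interface of the job records yielded by Sidekiq::Queue#each).
def delete_jobs_for_projects(queue, worker_class:, project_ids:, dry_run: true)
  ids = project_ids.to_a
  matched = []

  queue.each do |job|
    # Only treat args[0] as a project ID for the worker we actually inspected.
    next unless job.klass == worker_class
    next unless ids.include?(job.args[0])

    matched << job
  end

  # Default to a dry run so the operator sees the count before deleting anything.
  matched.each(&:delete) unless dry_run
  matched.size
end

# Possible use in the Rails console, reusing `id_list` from the runbook:
#   queue = Sidekiq::Queue.new('repository_import')
#   delete_jobs_for_projects(queue, worker_class: 'RepositoryImportWorker', project_ids: id_list)
#   delete_jobs_for_projects(queue, worker_class: 'RepositoryImportWorker', project_ids: id_list, dry_run: false)
```

Checking the worker class before reading `args[0]` means that pointing the helper at a different queue matches nothing instead of deleting jobs whose first argument happens to equal a project ID.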
Once we deliver scalability#9 we will have a way of removing jobs in a generic manner, but until then we should place more emphasis on this not being a general approach.
Additionally, the runbooks do not currently reference the Sidekiq Monitor feature and its ability to cancel running jobs, as described in the documentation: https://gitlab.com/gitlab-org/gitlab-foss/blob/master/doc/administration/troubleshooting/sidekiq.md#canceling-running-jobs-destructive