Make it easier to find and kill a JID from a Sidekiq worker
In https://gitlab.com/gitlab-com/infrastructure/issues/2746 we found that for long-running Sidekiq jobs, it's difficult to figure out which machine and PID are processing a given job. In order to kill the job, we had to:
- Look in the Kibana logs to find the last Sidekiq "start" entry
- Log into the machine and issue a `TSTP` signal on the Sidekiq worker to make it stop accepting new work
- Wait for a while for other jobs to finish
- Forcibly kill the Sidekiq process via `kill -9`
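
For reference, the manual sequence above could be sketched in Ruby roughly as follows. This is only an illustration: `quiet_then_kill`, `grace_seconds`, `signaller`, and `sleeper` are hypothetical names, the grace period is an arbitrary guess, and the script is assumed to run on the machine hosting the Sidekiq process with permission to signal it.

```ruby
# Hypothetical helper mirroring the manual steps: quiet the process,
# wait for in-flight jobs, then force-kill it.
# `signaller` and `sleeper` are injectable only to make the sketch testable;
# by default they are Ruby's own Process and Kernel modules.
def quiet_then_kill(pid, grace_seconds: 60, signaller: Process, sleeper: Kernel)
  signaller.kill("TSTP", pid)   # ask Sidekiq to stop accepting new work
  sleeper.sleep(grace_seconds)  # give the other jobs a while to finish
  signaller.kill("KILL", pid)   # same effect as `kill -9 <pid>`
end
```

Note that the force-kill also terminates any other jobs still running in that process, which is why the wait step matters.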
What we may want to consider:
- Add a status page in the Sidekiq admin panel that allows us to see all JIDs and which nodes/PIDs are processing which jobs (perhaps using https://github.com/mperham/sidekiq/wiki/API#workers)
- Add some sort of "ban" button to make that JID a NOP for the next hour
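
A minimal sketch of both ideas, under several assumptions. `locate_jid` and `BannedJidMiddleware` are hypothetical names, not existing Sidekiq or GitLab code. The lookup assumes the workers API yields `(process_id, thread_id, work)` triples, that `process_id` has the form `hostname:pid:nonce`, and that `work` is hash-like with `"queue"` and `"payload"` keys (the payload is a Hash in some Sidekiq versions and a JSON string in others, so both are handled). The ban list is a plain in-memory Hash here; a real version would more likely keep it in Redis with a TTL so that every node sees it.

```ruby
require "json"

# Hypothetical lookup: `workers` is anything that yields
# (process_id, thread_id, work) triples, e.g. Sidekiq::Workers.new.
# Returns the host, PID, thread, and queue for the JID, or nil if not running.
def locate_jid(workers, jid)
  workers.each do |process_id, thread_id, work|
    payload = work["payload"]
    payload = JSON.parse(payload) if payload.is_a?(String)
    next unless payload["jid"] == jid

    # Assumption: process_id looks like "hostname:pid:nonce".
    host, pid = process_id.split(":")
    return { host: host, pid: pid.to_i, thread: thread_id, queue: work["queue"] }
  end
  nil
end

# Hypothetical server middleware: turns a banned JID into a no-op
# until its ban expires. `banned` maps jid => expiry Time.
class BannedJidMiddleware
  def initialize(banned, clock: -> { Time.now })
    @banned = banned
    @clock = clock
  end

  def call(_worker, job, _queue)
    expiry = @banned[job["jid"]]
    return if expiry && @clock.call < expiry # skip the job body entirely

    yield
  end
end
```

A status page could call something like `locate_jid(Sidekiq::Workers.new, jid)` to show where a job is running, and the "ban" button could write `jid => Time.now + 3600` into the shared ban list. The middleware would be registered through Sidekiq's server middleware chain; it only prevents the job from running again (for example on retry) and does not interrupt an execution already in progress.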
/cc: @ayufan