SidekiqStatus should not report a failed Sidekiq job as running


Summary

When an exception is raised in a Sidekiq job, the job's SidekiqStatus key is not unset. The job is still reported as running until the key expires (default 30 min).

Steps to reproduce

  1. Modify a job to raise an exception.
  2. Enqueue the job and note its job ID.
  3. Wait a few seconds for the job to run and error.
  4. Observe that SidekiqStatus.running?(job_id) returns true, which is incorrect because the job has already died.

What is the current bug behavior?

After the exception, the job's SidekiqStatus key stays set in Redis until it expires (default 30 min), so SidekiqStatus.running?(job_id) keeps returning true for a job that has already failed.
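
The following is a minimal sketch of how this can happen. It assumes the status is unset only after the job returns normally. STATUS and run_with_status are hypothetical stand-ins for the Redis keys and the server middleware, not the actual GitLab code.

```ruby
# In-memory stand-in for the Redis status keys (hypothetical).
STATUS = {}

# Assumed shape of the current behavior: the unset step runs only
# when the job block returns without raising.
def run_with_status(jid)
  STATUS[jid] = true
  result = yield
  STATUS.delete(jid) # skipped when the block raises
  result
end

begin
  run_with_status('abc123') { raise 'boom' }
rescue RuntimeError
  # Sidekiq would record the failure here.
end

STATUS.key?('abc123') # => true, although the job has already died
```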

This exacerbated another problem here: https://gitlab.com/gitlab-com/gl-infra/gitlab-dedicated/team/-/issues/8002#note_2386385834

What is the expected correct behavior?

When an exception is raised in a Sidekiq job, the job's SidekiqStatus should be unset immediately (if possible), so that SidekiqStatus.running?(job_id) returns false.

Possible fixes

When an exception is raised in a Sidekiq job, we should still attempt to call SidekiqStatus.unset, so that the key is not orphaned in Redis.

This is where it is normally unset: https://gitlab.com/gitlab-org/gitlab/-/blob/v17.9.0-ee/lib/gitlab/sidekiq_status/server_middleware.rb#L9
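
One possible shape for the fix is to move the unset call into an ensure block, so it runs whether the job returns or raises. This is only a sketch: FakeStatus and EnsureUnsetMiddleware are hypothetical names, and FakeStatus is an in-memory stand-in for the Redis-backed SidekiqStatus.

```ruby
# In-memory stand-in for the Redis-backed status store (hypothetical).
module FakeStatus
  @keys = {}

  def self.set(jid)
    @keys[jid] = true
  end

  def self.unset(jid)
    @keys.delete(jid)
  end

  def self.running?(jid)
    @keys.key?(jid)
  end
end

# Hypothetical server middleware using the Sidekiq middleware call signature.
class EnsureUnsetMiddleware
  def call(_worker, job, _queue)
    yield
  ensure
    # Runs on success and on failure. The exception, if any, still propagates.
    FakeStatus.unset(job['jid'])
  end
end

FakeStatus.set('abc123')
begin
  EnsureUnsetMiddleware.new.call(nil, { 'jid' => 'abc123' }, 'default') { raise 'boom' }
rescue RuntimeError
  # The error still reaches the caller, so Sidekiq can handle the failure as usual.
end

FakeStatus.running?('abc123') # => false
```

Because ensure does not swallow the exception, Sidekiq's normal failure handling is unaffected. Whether the key should also be cleared for jobs that are going to be retried is a separate decision that this sketch does not address.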
