Skip to content

Fix daemon memory killer jobs hash thread safety issue

What does this MR do?

There is a bug can't add a new key into hash during iteration reported from Sentry https://sentry.gitlab.net/gitlab/devgitlaborg/issues/1837678.

The code causes this bug is: https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/sidekiq_daemon/memory_killer.rb#L233

      def rss_increase_by_jobs
        Gitlab::SidekiqDaemon::Monitor.instance.jobs.sum do |job| # rubocop:disable CodeReuse/ActiveRecord
          rss_increase_by_job(job)
        end
      end

This code jobs.sum will enter iteration without Mutex.synchronize.

During the jobs.sum iteration, other Sidekiq job could call Gitlab::SidekiqDaemon::Monitor.instance.within_job(https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/sidekiq_middleware/monitor.rb#L7). This will call jobs[jid] = { worker_class: worker_class, thread: Thread.current, started_at: Gitlab::Metrics::System.monotonic_time }. Now we have a jobs[jid]= assignment happen during jobs iteration, thus we get the error.

The fix is to guard this code piece jobs.sum with job_mutex.

Local tests have been performed #249519 (comment 413582459)

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Closes #249519 (closed)

Edited by Qingyu Zhao

Merge request reports