Sidekiq stuck on Redis or MySQL connections

Customer has an HA setup with a MySQL database and Redis Sentinel. One node appears to have stuck Sidekiq jobs. When we first logged in, there were about 44 running jobs. About 30 minutes later, it crept up slowly to 55.

Attached is the redacted TTIN output. It looks like most of the threads have one of three backtraces:

DB release_connection

2017-04-24_21:19:41.83783 2017-04-24T21:19:41.836Z 30447 TID-otapkuv4w WARN: Thread TID-otan7jcis 
2017-04-24_21:19:41.83783 2017-04-24T21:19:41.836Z 30447 TID-otapkuv4w WARN: /opt/gitlab/embedded/lib/ruby/2.3.0/monitor.rb:187:in `lock'
2017-04-24_21:19:41.83783 /opt/gitlab/embedded/lib/ruby/2.3.0/monitor.rb:187:in `mon_enter'
2017-04-24_21:19:41.83783 /opt/gitlab/embedded/lib/ruby/2.3.0/monitor.rb:212:in `mon_synchronize'
2017-04-24_21:19:41.83783 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:280:in `release_connection'
2017-04-24_21:19:41.83784 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:552:in `each'
2017-04-24_21:19:41.83784 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:552:in `clear_active_connections!'
2017-04-24_21:19:41.83784 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/activerecord-4.2.7.1/lib/active_record/connection_handling.rb:129:in `clear_active_connections!'
2017-04-24_21:19:41.83784 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/sidekiq-4.2.7/lib/sidekiq/middleware/server/active_record.rb:8:in `call'
2017-04-24_21:19:41.83785 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/sidekiq-4.2.7/lib/sidekiq/middleware/chain.rb:130:in `block in invoke'
2017-04-24_21:19:41.83785 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/sidekiq-4.2.7/lib/sidekiq/middleware/server/retry_jobs.rb:74:in `call'
2017-04-24_21:19:41.83785 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/sidekiq-4.2.7/lib/sidekiq/middleware/chain.rb:130:in `block in invoke'

DB retrieve_connection

2017-04-24_21:19:41.85436 2017-04-24T21:19:41.853Z 30447 TID-otapkuv4w WARN: Thread TID-otapkqrk4 
2017-04-24_21:19:41.85436 2017-04-24T21:19:41.853Z 30447 TID-otapkuv4w WARN: /opt/gitlab/embedded/lib/ruby/2.3.0/monitor.rb:187:in `lock'
2017-04-24_21:19:41.85437 /opt/gitlab/embedded/lib/ruby/2.3.0/monitor.rb:187:in `mon_enter'
2017-04-24_21:19:41.85437 /opt/gitlab/embedded/lib/ruby/2.3.0/monitor.rb:212:in `mon_synchronize'
2017-04-24_21:19:41.85437 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:262:in `connection'
2017-04-24_21:19:41.85437 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/activerecord-4.2.7.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:571:in `retrieve_connection'
2017-04-24_21:19:41.85437 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/activerecord-4.2.7.1/lib/active_record/connection_handling.rb:113:in `retrieve_connection'
2017-04-24_21:19:41.85438 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/activerecord-4.2.7.1/lib/active_record/connection_handling.rb:87:in `connection'
2017-04-24_21:19:41.85438 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/activerecord-4.2.7.1/lib/active_record/core.rb:187:in `find_by'
2017-04-24_21:19:41.85438 /opt/gitlab/embedded/service/gitlab-rails/app/workers/build_queue_worker.rb:6:in `perform'
2017-04-24_21:19:41.85438 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/sidekiq-4.2.7/lib/sidekiq/processor.rb:158:in `execute_job'
2017-04-24_21:19:41.85438 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/sidekiq-4.2.7/lib/sidekiq/processor.rb:138:in `block (4 levels) in process'

Redis select

2017-04-24_21:19:41.84037 2017-04-24T21:19:41.839Z 30447 TID-otapkuv4w WARN: Thread TID-otan7j2wo 
2017-04-24_21:19:41.84037 2017-04-24T21:19:41.840Z 30447 TID-otapkuv4w WARN: /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/connection/ruby.rb:52:in `select'
2017-04-24_21:19:41.84038 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/connection/ruby.rb:52:in `rescue in _read_from_socket'
2017-04-24_21:19:41.84038 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/connection/ruby.rb:48:in `_read_from_socket'
2017-04-24_21:19:41.84038 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/connection/ruby.rb:41:in `gets'
2017-04-24_21:19:41.84038 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/connection/ruby.rb:273:in `read'
2017-04-24_21:19:41.84038 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/client.rb:249:in `block in read'
2017-04-24_21:19:41.84039 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/client.rb:237:in `io'
2017-04-24_21:19:41.84039 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/client.rb:248:in `read'
2017-04-24_21:19:41.84039 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/client.rb:113:in `block in call'
2017-04-24_21:19:41.84039 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/client.rb:218:in `block (2 levels) in process'
2017-04-24_21:19:41.84039 /opt/gitlab/embedded/lib/ruby/gems/2.3.0/gems/redis-3.2.2/lib/redis/client.rb:354:in `ensure_connected'

This suggests to me that Sidekiq got into some deadlock trying to synchronize threads for the database connection pool. Some relevant issues that point the blame to issues with Timeout.timeout:

  • https://github.com/rails/rails/issues/10039
  • https://github.com/rails/rails/issues/22031
  • https://github.com/mperham/sidekiq/issues/1047#issuecomment-20641031

Note that BOTH mysql2 and redis-rb gems have updates to fix issues with Timeout:

  • https://github.com/brianmario/mysql2/releases
  • https://github.com/redis/redis-rb/blob/master/CHANGELOG.md#333

Here is what we are doing about it:

  • Upgraded to the latest mysql2 gem: https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/1715
  • Upgraded to the latest Sidekiq (still uses redis-rb 3.2 that uses Timeout): https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/10899
  • Upgraded Sidekiq to the latest redis-rb version that drops the use of Timeout: https://github.com/mperham/sidekiq/pull/3441
  • Created an issue to remove Timeout from our LDAP code: https://gitlab.com/gitlab-org/gitlab-ce/issues/31369

sidekiq.txt

/cc: @gl-support

ZenDesk: https://gitlab.zendesk.com/agent/tickets/72840, https://gitlab.zendesk.com/agent/tickets/56118

Edited Apr 22, 2025 by 🤖 GitLab Bot 🤖
Assignee Loading
Time tracking Loading