Cron definition causes high CPU usage because of an endless loop
Summary
We noticed continuous high CPU usage a Gitlab EE instance that was previously running fine based on the stack trace we ended up tracking down the cause to an invalid cron definition using the following steps:
- Dumped sidekiq thread dump
kill -TTIN <sidekiq_pid>
- Compared normal operation dump vs. high cpu dump, this stood out:
2019-03-20_18:36:54.79856 2019-03-20T18:36:54.798Z 2977 TID-ot8g7o9m9 WARN: Thread TID-ot89u8ea9 scheduler
2019-03-20_18:36:54.79857 2019-03-20T18:36:54.798Z 2977 TID-ot8g7o9m9 WARN: /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/tzinfo-1.2.5/lib/tzinfo/time_or_datetime.rb:321:in `wrap'
2019-03-20_18:36:54.79857 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/tzinfo-1.2.5/lib/tzinfo/timezone.rb:420:in `utc_to_local'
2019-03-20_18:36:54.79857 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/et-orbi-1.1.7/lib/et-orbi/eo_time.rb:358:in `to_time'
2019-03-20_18:36:54.79858 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/et-orbi-1.1.7/lib/et-orbi/eo_time.rb:203:in `block (2 levels) in <class:EoTime>'
2019-03-20_18:36:54.79858 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/fugit-1.1.7/lib/fugit/cron.rb:76:in `block (2 levels) in <class:TimeCursor>'
2019-03-20_18:36:54.79858 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/fugit-1.1.7/lib/fugit/cron.rb:119:in `sec_match?'
2019-03-20_18:36:54.79858 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/fugit-1.1.7/lib/fugit/cron.rb:252:in `block in previous_time'
2019-03-20_18:36:54.79858 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/fugit-1.1.7/lib/fugit/cron.rb:232:in `loop'
2019-03-20_18:36:54.79859 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/fugit-1.1.7/lib/fugit/cron.rb:232:in `previous_time'
2019-03-20_18:36:54.79859 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/job.rb:561:in `not_past_scheduled_time?'
2019-03-20_18:36:54.79859 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/job.rb:23:in `block in should_enque?'
2019-03-20_18:36:54.79859 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-5.2.5/lib/sidekiq.rb:97:in `block in redis'
2019-03-20_18:36:54.79860 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/connection_pool-2.2.2/lib/connection_pool.rb:65:in `block (2 levels) in with'
2019-03-20_18:36:54.79860 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/connection_pool-2.2.2/lib/connection_pool.rb:64:in `handle_interrupt'
2019-03-20_18:36:54.79860 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/connection_pool-2.2.2/lib/connection_pool.rb:64:in `block in with'
2019-03-20_18:36:54.79860 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/connection_pool-2.2.2/lib/connection_pool.rb:61:in `handle_interrupt'
2019-03-20_18:36:54.79861 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/connection_pool-2.2.2/lib/connection_pool.rb:61:in `with'
2019-03-20_18:36:54.79861 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-5.2.5/lib/sidekiq.rb:94:in `redis'
2019-03-20_18:36:54.79861 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/job.rb:21:in `should_enque?'
2019-03-20_18:36:54.79862 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/job.rb:42:in `test_and_enque_for_time!'
2019-03-20_18:36:54.79862 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/poller.rb:28:in `enqueue_job'
2019-03-20_18:36:54.79862 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/poller.rb:15:in `block in enqueue'
2019-03-20_18:36:54.79863 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/poller.rb:14:in `each'
2019-03-20_18:36:54.79863 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/poller.rb:14:in `enqueue'
2019-03-20_18:36:54.79863 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-5.2.5/lib/sidekiq/scheduled.rb:68:in `block in start'
2019-03-20_18:36:54.79863 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-5.2.5/lib/sidekiq/util.rb:16:in `watchdog'
2019-03-20_18:36:54.79863 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-5.2.5/lib/sidekiq/util.rb:25:in `block in safe_thread'
- From the above trace we found this fugit issue although it states that it was fixed in
1.1.7
(used by Gitlab) it lead us to check our cron expressions, we had:
gitlab_rails['ldap_sync_worker_cron'] = "5 * * * * *"
gitlab_rails['ldap_group_sync_worker_cron'] = "10 * * * * *"
instead of:
gitlab_rails['ldap_sync_worker_cron'] = "5 * * * *"
gitlab_rails['ldap_group_sync_worker_cron'] = "10 * * * *"
Once the cron expression is fixed cpu usage goes back to normal, invalid cron expressions should not cause such behavior.
Steps to reproduce
Add this invalid cron expression on a LDAP enabled instance to gitlab.rb
, restart and wait a few minutes:
gitlab_rails['ldap_sync_worker_cron'] = "5 * * * * *"
Example Project
Not available.
What is the current bug behavior?
The invalid configuration causes high cpu usage instead of reporting an error.
What is the expected correct behavior?
Report the invalid cron expression instead of consuming cpu in an endless loop.
Relevant logs and/or screenshots
Trace:
2019-03-20_18:36:54.79856 2019-03-20T18:36:54.798Z 2977 TID-ot8g7o9m9 WARN: Thread TID-ot89u8ea9 scheduler
2019-03-20_18:36:54.79857 2019-03-20T18:36:54.798Z 2977 TID-ot8g7o9m9 WARN: /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/tzinfo-1.2.5/lib/tzinfo/time_or_datetime.rb:321:in `wrap'
2019-03-20_18:36:54.79857 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/tzinfo-1.2.5/lib/tzinfo/timezone.rb:420:in `utc_to_local'
2019-03-20_18:36:54.79857 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/et-orbi-1.1.7/lib/et-orbi/eo_time.rb:358:in `to_time'
2019-03-20_18:36:54.79858 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/et-orbi-1.1.7/lib/et-orbi/eo_time.rb:203:in `block (2 levels) in <class:EoTime>'
2019-03-20_18:36:54.79858 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/fugit-1.1.7/lib/fugit/cron.rb:76:in `block (2 levels) in <class:TimeCursor>'
2019-03-20_18:36:54.79858 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/fugit-1.1.7/lib/fugit/cron.rb:119:in `sec_match?'
2019-03-20_18:36:54.79858 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/fugit-1.1.7/lib/fugit/cron.rb:252:in `block in previous_time'
2019-03-20_18:36:54.79858 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/fugit-1.1.7/lib/fugit/cron.rb:232:in `loop'
2019-03-20_18:36:54.79859 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/fugit-1.1.7/lib/fugit/cron.rb:232:in `previous_time'
2019-03-20_18:36:54.79859 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/job.rb:561:in `not_past_scheduled_time?'
2019-03-20_18:36:54.79859 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/job.rb:23:in `block in should_enque?'
2019-03-20_18:36:54.79859 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-5.2.5/lib/sidekiq.rb:97:in `block in redis'
2019-03-20_18:36:54.79860 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/connection_pool-2.2.2/lib/connection_pool.rb:65:in `block (2 levels) in with'
2019-03-20_18:36:54.79860 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/connection_pool-2.2.2/lib/connection_pool.rb:64:in `handle_interrupt'
2019-03-20_18:36:54.79860 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/connection_pool-2.2.2/lib/connection_pool.rb:64:in `block in with'
2019-03-20_18:36:54.79860 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/connection_pool-2.2.2/lib/connection_pool.rb:61:in `handle_interrupt'
2019-03-20_18:36:54.79861 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/connection_pool-2.2.2/lib/connection_pool.rb:61:in `with'
2019-03-20_18:36:54.79861 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-5.2.5/lib/sidekiq.rb:94:in `redis'
2019-03-20_18:36:54.79861 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/job.rb:21:in `should_enque?'
2019-03-20_18:36:54.79862 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/job.rb:42:in `test_and_enque_for_time!'
2019-03-20_18:36:54.79862 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/poller.rb:28:in `enqueue_job'
2019-03-20_18:36:54.79862 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/poller.rb:15:in `block in enqueue'
2019-03-20_18:36:54.79863 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/poller.rb:14:in `each'
2019-03-20_18:36:54.79863 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-cron-1.0.4/lib/sidekiq/cron/poller.rb:14:in `enqueue'
2019-03-20_18:36:54.79863 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-5.2.5/lib/sidekiq/scheduled.rb:68:in `block in start'
2019-03-20_18:36:54.79863 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-5.2.5/lib/sidekiq/util.rb:16:in `watchdog'
2019-03-20_18:36:54.79863 /opt/gitlab/embedded/lib/ruby/gems/2.5.0/gems/sidekiq-5.2.5/lib/sidekiq/util.rb:25:in `block in safe_thread'
top process output (90.4 cpu usage):
2977 git 20 0 874248 509296 18592 S 90.4 9.6 603:33.98 sidekiq 5.2.5 gitlab-rails [0 of 25 busy]
Output of checks
Results of GitLab environment info
Expand for output related to GitLab environment info
( System information System: Ubuntu 18.04 Proxy: no Current User: git Using RVM: no Ruby Version: 2.5.3p105 Gem Version: 2.7.6 Bundler Version:1.16.6 Rake Version: 12.3.2 Redis Version: 3.2.12 Git Version: 2.18.1 Sidekiq Version:5.2.5 Go Version: unknown
GitLab information Version: 11.8.2-ee Revision: 4618789681f Directory: /opt/gitlab/embedded/service/gitlab-rails DB Adapter: postgresql DB Version: 9.6.11 URL: https://removed-public-name HTTP Clone URL: https://removed-public-name/some-group/some-project.git SSH Clone URL: git@removed-public-name:some-group/some-project.git Elasticsearch: no Geo: no Using LDAP: yes Using Omniauth: yes Omniauth Providers:
GitLab Shell Version: 8.4.4 Repository storage paths:
- default: /var/opt/gitlab/git-data/repositories Hooks: /opt/gitlab/embedded/service/gitlab-shell/hooks Git: /opt/gitlab/embedded/bin/git )
Results of GitLab application Check
Expand for output related to the GitLab application check
(Checking GitLab subtasks ...
Checking GitLab Shell ...
GitLab Shell: ... GitLab Shell version >= 8.4.4 ? ... OK (8.4.4) Running /opt/gitlab/embedded/service/gitlab-shell/bin/check Check GitLab API access: OK Redis available via internal API: OK
Access to /var/opt/gitlab/.ssh/authorized_keys: OK gitlab-shell self-check successful
Checking GitLab Shell ... Finished
Checking Gitaly ...
Gitaly: ... default ... OK
Checking Gitaly ... Finished
Checking Sidekiq ...
Sidekiq: ... Running? ... yes Number of Sidekiq processes ... 1
Checking Sidekiq ... Finished
Checking Incoming Email ...
Incoming Email: ... Reply by email is disabled in config/gitlab.yml
Checking Incoming Email ... Finished
Checking LDAP ...
LDAP: ... Server: ldapmain not verifying SSL hostname of LDAPS server 'ad-server:389' LDAP authentication... Success LDAP users with access to your GitLab server [...] Checking LDAP ... Finished
Checking GitLab App ...
Git configured correctly? ... yes Database config exists? ... yes All migrations up? ... yes Database contains orphaned GroupMembers? ... no GitLab config exists? ... yes GitLab config up to date? ... yes Log directory writable? ... yes Tmp directory writable? ... yes Uploads directory exists? ... yes Uploads directory has correct permissions? ... yes Uploads directory tmp has correct permissions? ... yes Init script exists? ... skipped (omnibus-gitlab has no init script) Init script up-to-date? ... skipped (omnibus-gitlab has no init script) Projects have namespace: ... 2/1 ... yes Redis version >= 2.8.0? ... yes Ruby version >= 2.3.5 ? ... yes (2.5.3) Git version >= 2.18.0 ? ... yes (2.18.1) Git user has default SSH configuration? ... yes Active users: ... 4 Elasticsearch version 5.6 - 6.x? ... skipped (elasticsearch is disabled)
Checking GitLab App ... Finished
Checking GitLab subtasks ... Finished )
Possible fixes
(Probably an unfixed case of this issue) or simply work around by fixing the cron expressions.