SIGHUP to Unicorn reruns initialization code, crashes unicorn
Sending a SIGHUP signal to the parent unicorn process on GDK will crash the process.
This is because Gitlab::Cluster::LifecycleEvents.on_before_fork is expected to run once in the lifetime of the process, but on SIGHUP it is run a second time.
Any code that expects to be initialized only once then crashes. The first place this happens is here:
| E, [2019-07-18T22:30:54.224728 #13913] ERROR -- : Gitlab::Metrics::Samplers::UnicornSampler singleton instance already initialized (RuntimeError)
22:30:54 rails-web.1 | /Users/andrewn/code/gitlab/gitlab-development-kit/gitlab/lib/gitlab/daemon.rb:6:in `initialize_instance'
22:30:54 rails-web.1 | /Users/andrewn/code/gitlab/gitlab-development-kit/gitlab/config/initializers/7_prometheus_metrics.rb:49:in `block in <main>'
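For readers unfamiliar with this kind of guard, here is a minimal sketch of a singleton initializer that raises when called twice. The class name SketchDaemon is hypothetical and this is not the actual lib/gitlab/daemon.rb code; only the error message wording is taken from the log above.

```ruby
# Hypothetical stand-in for a daemon-style singleton; NOT the real Gitlab::Daemon.
class SketchDaemon
  def self.initialize_instance(*args)
    # A second call finds @instance already set and raises a RuntimeError.
    raise "#{name} singleton instance already initialized" if @instance

    @instance = new(*args)
  end
end

# The first initialization succeeds.
SketchDaemon.initialize_instance

# The second initialization (what a re-run of the hooks would do) raises.
second_call_error = nil
begin
  SketchDaemon.initialize_instance
rescue RuntimeError => e
  second_call_error = e.message
end

puts second_call_error
```

If the before-fork hooks run again in the same process, any initializer guarded like this would fail in the same way.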
I suspected that the problem was that state in unicorn.rb is not preserved across SIGHUP signals, leading to the run_once variable being reset on SIGHUP. See https://github.com/defunkt/unicorn/blob/d03dd4e9e4ff29689752b7c82202008fefaf1210/lib/unicorn/http_server.rb#L764-L779
This hypothesis proved correct: after switching to a global variable, the problem stopped happening. This is a horrible solution, and I've only done it to prove the point that state is not being preserved.
# Only set the flag if it does not exist yet, so a reload does not reset it
unless defined?($run_once)
  $run_once = true
end

before_fork do |server, worker|
  if $run_once
    # There is a difference between Puma and Unicorn:
    # - Puma calls before_fork once when booting up master process
    # - Unicorn runs before_fork whenever new worker is spawned
    # To unify this behavior we call before_fork only once (we use
    # this callback for deleting Prometheus files so for our purposes
    # it makes sense to align behavior with Puma)
    $run_once = false

    # Signal application hooks that we're about to fork
    Gitlab::Cluster::LifecycleEvents.do_before_fork
  end
end
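The difference between the local and the global flag can be sketched in isolation. This is a minimal simulation, assuming the config file is re-evaluated from scratch on SIGHUP; FakeConfigurator, fork_worker, the counters and $sketch_run_once are hypothetical names used for illustration and are not Unicorn's API.

```ruby
# Hypothetical configurator that evaluates a config source, the way a
# reload on SIGHUP is assumed to re-read the config file.
class FakeConfigurator
  attr_reader :local_runs, :global_runs

  def initialize(source)
    @source = source
    @local_runs = 0
    @global_runs = 0
    reload
  end

  # Simulates a reload: the whole config source is evaluated again.
  def reload
    instance_eval(@source)
  end

  # Stores the hook, replacing any hook from a previous evaluation.
  def before_fork(&block)
    @before_fork = block
  end

  # Simulates spawning one worker by running the stored hook.
  def fork_worker
    @before_fork.call
  end

  def count_local!
    @local_runs += 1
  end

  def count_global!
    @global_runs += 1
  end
end

SOURCE = <<~'RUBY'
  run_once = true                                            # local: recreated on every evaluation
  $sketch_run_once = true unless defined?($sketch_run_once)  # global: survives re-evaluation

  before_fork do
    if run_once
      run_once = false
      count_local!
    end
    if $sketch_run_once
      $sketch_run_once = false
      count_global!
    end
  end
RUBY

config = FakeConfigurator.new(SOURCE)
config.fork_worker # first worker: both guards fire
config.fork_worker # second worker: neither guard fires
config.reload      # simulated SIGHUP
config.fork_worker # local guard fires again, global guard does not

puts "local guard fired #{config.local_runs} time(s)"
puts "global guard fired #{config.global_runs} time(s)"
```

In this simulation the local flag lets the guarded code run once per evaluation of the config, while the global flag lets it run once per process, which matches the behavior observed above.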
This crash may or may not be interfering with our production HUPs: https://gitlab.com/gitlab-org/gitlab-ce/issues/64741#note_193462664. Further investigation is needed to confirm this, though.
Related changes: omnibus-gitlab!3357 (merged) https://gitlab.com/gitlab-org/gitlab-ce/issues/57052.
Recreating the issue
This issue can be recreated with a fresh version of unicorn.rb:
~/code/gitlab/gitlab-development-kit $ git pull && rm gitlab/config/unicorn.rb && make gitlab/config/unicorn.rb
Start GDK and run the following after the application has started:
kill -HUP $(pgrep -f 'unicorn_rails master')
cc @jarv @jprovaznik
Slack thread: https://gitlab.slack.com/archives/C2Z9A056E/p1563520239309500 (temporary)