Add forceful shutdown when SIGTERM is received
Description
Currently, when buildbox-worker receives a SIGINT or SIGTERM signal, it starts a graceful shutdown: it notifies BuildGrid that it is exiting so that it is not assigned more work, then waits for any ongoing leases to finish. If buildbox-worker is processing a long-running job, it can take a very long time before it actually exits, and using SIGKILL instead will leave the runner processes orphaned.
To help alleviate this, buildbox-worker now performs a "forceful shutdown" when it receives a SIGTERM: it immediately shuts down any runner processes with SIGTERM and exits. It also gracefully handles the case where it first receives a SIGINT and then a SIGTERM some time later.
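The intended behavior can be sketched roughly as follows. This is a minimal Python illustration, not the actual buildbox-worker code: the `Worker` class, its attributes, and its method names are all hypothetical, and it assumes a POSIX system where runners are child processes.

```python
import signal
import subprocess
import sys


class Worker:
    """Hypothetical stand-in for the worker's shutdown logic."""

    def __init__(self):
        self.runners = []           # running child processes (subprocess.Popen)
        self.accepting_work = True  # False once any shutdown has started
        self.forceful = False       # True once SIGTERM has been received

    def install_handlers(self):
        # Must be called from the main thread.
        signal.signal(signal.SIGINT, self.handle_signal)
        signal.signal(signal.SIGTERM, self.handle_signal)

    def handle_signal(self, signum, frame=None):
        # Either signal stops the worker from taking on new leases.
        self.accepting_work = False
        if signum == signal.SIGTERM:
            # Forceful shutdown: forward SIGTERM to every live runner.
            # This also covers a SIGTERM arriving after an earlier SIGINT.
            self.forceful = True
            for proc in self.runners:
                if proc.poll() is None:
                    proc.terminate()  # sends SIGTERM on POSIX

    def wait_for_runners(self):
        # Graceful path waits for leases to finish; after a SIGTERM the
        # runners have already been signalled, so this returns promptly.
        for proc in self.runners:
            proc.wait()
```

In this sketch a SIGINT only flips `accepting_work`, so in-flight runners keep going, while a SIGTERM at any point terminates them so the worker can exit without leaving orphans.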
This PR is now built on top of !117 (merged), which does most of the heavy lifting in allowing calls to be interrupted and responding promptly to signals.
Validation
I'm working on a new e2e test for this behavior, but you can also verify it by sending a sleep SOME_LARGE_NUMBER job to be executed and then sending a SIGTERM to buildbox-worker.