Add forceful shutdown when SIGTERM is received

Jeremiah Bonney requested to merge jbonney/sigint-vs-sigterm into master

Description

When buildbox-worker receives a SIGINT or SIGTERM signal, it begins a graceful shutdown: it notifies BuildGrid that it is exiting so that it is not assigned more work, then waits for any ongoing leases to finish. If buildbox-worker is processing a long-running job, it can take a very long time to actually exit, and using SIGKILL instead will leave the runner processes orphaned.

To help alleviate this, buildbox-worker now performs a "forceful shutdown" when it receives a SIGTERM: it immediately sends SIGTERM to any runner processes and then exits. It also handles the case where it first receives a SIGINT and then receives a SIGTERM some time later.
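
As a rough illustration of the intended behavior (not the actual buildbox-worker code, which is not shown here), the signal handling can be thought of as a small state machine that only ever escalates. The names `ShutdownMode` and `ShutdownState` are hypothetical, and the rule that a SIGINT arriving after a SIGTERM does not downgrade the shutdown is an assumption of this sketch rather than something this description states.

```python
import enum
import signal


class ShutdownMode(enum.Enum):
    RUNNING = 0
    GRACEFUL = 1  # stop accepting work, let active leases finish
    FORCEFUL = 2  # terminate runner processes immediately, then exit


class ShutdownState:
    """Hypothetical tracker for the most severe shutdown mode requested so far."""

    def __init__(self):
        self.mode = ShutdownMode.RUNNING

    def handle_signal(self, signum, frame=None):
        if signum == signal.SIGTERM:
            requested = ShutdownMode.FORCEFUL
        elif signum == signal.SIGINT:
            requested = ShutdownMode.GRACEFUL
        else:
            return  # other signals are ignored in this sketch
        # Only escalate. Assumption: a SIGINT after a SIGTERM does not
        # downgrade a forceful shutdown back to a graceful one.
        if requested.value > self.mode.value:
            self.mode = requested

    def install(self):
        # Register the handler for both signals (main thread only).
        signal.signal(signal.SIGINT, self.handle_signal)
        signal.signal(signal.SIGTERM, self.handle_signal)
```

A worker loop built on this would check `mode` between units of work: in `GRACEFUL` it stops requesting leases and waits for the current ones, while in `FORCEFUL` it signals its runner processes and exits right away.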

This PR is now built on top of !117 (merged), which does most of the heavy lifting in allowing calls to be interrupted and responding promptly to signals.

Validation

I'm working on a new e2e test for this behavior, but you can also verify it by sending a sleep SOME_LARGE_NUMBER job to be executed and then sending a SIGTERM to buildbox-worker.
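
Until that e2e test exists, the manual check could be scripted along these lines. This is only a sketch for POSIX systems: `terminates_on_sigterm` and `start_stand_in` are hypothetical helpers, and the stand-in child process merely sleeps in place of a real buildbox-worker, whose command line is not assumed here. It checks prompt exit only; it does not verify that no runner processes were orphaned.

```python
import signal
import subprocess
import sys


def terminates_on_sigterm(proc, timeout_s=5.0):
    """Send SIGTERM to proc and report whether it exits within timeout_s."""
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Clean up so the check itself does not leak a process.
        proc.kill()
        proc.wait()
        return False
    return True


def start_stand_in():
    # Stand-in for a worker busy with a long-running job. Replace this
    # with the real buildbox-worker invocation when testing for real.
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(3600)"]
    )


if __name__ == "__main__":
    worker = start_stand_in()
    print("exited promptly:", terminates_on_sigterm(worker))
```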

Edited by Jeremiah Bonney
