Fix BotsService not being stopped gracefully
Description
This MR attempts to fix the issue that BotsService cannot be stopped gracefully via SIGTERM.
How to reproduce it?
- Compose the following components:
  - BuildGrid with bots
  - One buildbox-worker connected to it
- Set keep-alive to a large value, e.g. 5 mins
- Don't submit any execution
- Send SIGTERM to BuildGrid, i.e. kill
Root cause
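A worker periodically polls BotsService via the UpdateBotsSession gRPC call. However, in our implementation, the call does not return immediately when no lease is available: the gRPC thread sleeps while waiting for a job. With a large keep-alive and no pending executions, that thread is still asleep when SIGTERM arrives, so the service cannot stop gracefully.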
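The blocking behaviour can be illustrated with a minimal sketch. This is not BuildGrid's actual code: `wait_for_lease` and `measure_blocked_time` are hypothetical names, and a plain `threading.Event` stands in for whatever the service uses to signal a new lease. It only shows that a thread waiting with a timeout stays blocked for the whole keep-alive period when nothing wakes it.

```python
import threading
import time


def wait_for_lease(lease_available: threading.Event, keep_alive: float) -> bool:
    """Hypothetical long poll: block until a lease arrives or keep_alive expires."""
    return lease_available.wait(timeout=keep_alive)


def measure_blocked_time(keep_alive: float) -> float:
    """Return how long the polling thread stays blocked when no lease ever arrives."""
    lease_available = threading.Event()  # never set: no execution is submitted
    start = time.monotonic()
    poller = threading.Thread(target=wait_for_lease, args=(lease_available, keep_alive))
    poller.start()
    # A shutdown request issued here has no way to reach the sleeping thread,
    # so joining it takes roughly the full keep_alive period.
    poller.join()
    return time.monotonic() - start
```

With a keep-alive of 5 minutes, a shutdown that has to join such a thread could take up to 5 minutes to complete.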
Fix
This fix wakes up the gRPC threads blocked in UpdateBotsSession and effectively tells the worker to cancel the session. This makes sense since the service is being shut down.
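The general idea behind the fix can be sketched as follows. This is an illustration under assumptions, not the code in this MR: `LeaseWaiter`, `offer_lease`, `stop`, and `wait_for_lease` are hypothetical names. The waiting thread sleeps on a `threading.Condition` whose predicate also checks a shutdown flag, so a stop request wakes it immediately and the caller can report a cancelled session.

```python
import threading


class LeaseWaiter:
    """Hypothetical helper: long-poll for a lease, but wake up early on shutdown."""

    def __init__(self):
        self._cond = threading.Condition()
        self._lease = None
        self._stopping = False

    def offer_lease(self, lease):
        """Make a lease available and wake any waiting thread."""
        with self._cond:
            self._lease = lease
            self._cond.notify_all()

    def stop(self):
        """Mark the service as shutting down and wake every waiting thread."""
        with self._cond:
            self._stopping = True
            self._cond.notify_all()

    def wait_for_lease(self, timeout):
        """Return (lease, cancelled); cancelled is True when woken by stop()."""
        with self._cond:
            self._cond.wait_for(
                lambda: self._lease is not None or self._stopping,
                timeout=timeout,
            )
            if self._stopping:
                return None, True
            lease, self._lease = self._lease, None
            return lease, False
```

In this sketch, a SIGTERM handler registered with `signal.signal` could call `stop()`, after which every pending `wait_for_lease` returns `(None, True)` instead of sleeping until the keep-alive expires.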
Edited by Zehao Chen