WIP: Add a check that the bot hasn't already been deleted before deleting it.
Description
The client may send multiple TERMINATING updateBotSession requests. When that happens, BuildGrid fails because it has already deleted the bot id.
This change ignores the request if the instance_name isn't found in the bots dict.
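The guard described above can be sketched roughly as follows. This is a minimal illustration, not the actual BuildGrid code: the `BotRegistry` class, its `_bots` dict, and the `close_bot_session` method are hypothetical names chosen for the example, and the dict key is assumed to be whatever identifier the server uses to track the bot.

```python
class BotRegistry:
    """Hypothetical stand-in for the server-side bot bookkeeping."""

    def __init__(self):
        # Assumed mapping: bot identifier -> session data.
        self._bots = {}

    def register(self, name, session):
        self._bots[name] = session

    def close_bot_session(self, name):
        """Remove a bot; return True if removed, False if it was already gone."""
        if name not in self._bots:
            # Duplicate TERMINATING update: nothing left to delete, so ignore it.
            return False
        del self._bots[name]
        return True


registry = BotRegistry()
registry.register("bot-1", {"status": "OK"})

first = registry.close_bot_session("bot-1")   # removes the bot
second = registry.close_bot_session("bot-1")  # duplicate request, ignored
print(first, second)
```

Checking membership before deleting makes the removal idempotent, so a repeated request no longer raises a KeyError on the bots dict. It does not by itself address any lease still associated with the bot.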
Edit:
This may need further investigation: what happens to the lease if the server receives multiple TERMINATING statuses from the worker?
Example error:
2019-07-19 16:12:32,642:[ grpc._server][ERROR]: Exception calling application: '3687ac7e-ff7e-4e61-8470-0a79bd4ec462'
Traceback (most recent call last):
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/grpc/_server.py", line 390, in _call_behavior
    return behavior(argument, context), True
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/_authentication.py", line 104, in __authorize_wrapper
    return behavior(self, request, context)
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/bots/service.py", line 145, in UpdateBotSession
    request.bot_session)
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/bots/instance.py", line 107, in update_bot_session
    checked_lease = self._check_lease_state(lease)
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/bots/instance.py", line 174, in _check_lease_state
    self._scheduler.update_job_lease_state(lease.id, lease)
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/scheduler.py", line 387, in update_job_lease_state
    self._update_job_operation_stage(job_name, operation_stage)
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/scheduler.py", line 605, in _update_job_operation_stage
    job = self.__jobs_by_name[job_name]
KeyError: '3687ac7e-ff7e-4e61-8470-0a79bd4ec462'
Edited by Arber X