WIP: Add additional check to make sure bot hasn't been previously deleted before deleting.

Arber X requested to merge arber/safe_access into master

Description

The client may send multiple TERMINATING updateBotSession requests. When that happens, BuildGrid fails because the bot id has already been deleted.

This change ignores the request if the instance_name isn't found in the bots dict.
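
A minimal sketch of the idea, not the actual BuildGrid code: the class `BotsRegistry`, its `_bots` dict layout, and the method names below are hypothetical stand-ins chosen for illustration. It only shows how a membership check makes a repeated delete a no-op instead of an error.

```python
class BotsRegistry:
    """Hypothetical stand-in for the bots instance; not BuildGrid's real class."""

    def __init__(self):
        # Assumed layout: instance/bot name -> session id.
        self._bots = {}

    def register(self, instance_name, session_id):
        self._bots[instance_name] = session_id

    def close_bot_session(self, instance_name):
        """Remove the bot if present.

        Returns True if the bot was removed, False if it was already gone.
        """
        if instance_name not in self._bots:
            # A repeated TERMINATING update ends up here and is ignored.
            return False
        del self._bots[instance_name]
        return True


registry = BotsRegistry()
registry.register("bot-1", "session-a")
first = registry.close_bot_session("bot-1")   # bot is removed
second = registry.close_bot_session("bot-1")  # already removed, ignored
```

Returning a boolean rather than silently passing is one possible design choice here: it lets the caller log or count duplicate TERMINATING updates if that turns out to be useful.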

Edit: This may need further investigation: what happens to the lease if the server gets multiple TERMINATING statuses from the worker?

Example error:

2019-07-19 16:12:32,642:[                        grpc._server][ERROR]: Exception calling application: '3687ac7e-ff7e-4e61-8470-0a79bd4ec462'
Traceback (most recent call last):
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/grpc/_server.py", line 390, in _call_behavior
    return behavior(argument, context), True
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/_authentication.py", line 104, in __authorize_wrapper
    return behavior(self, request, context)
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/bots/service.py", line 145, in UpdateBotSession
    request.bot_session)
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/bots/instance.py", line 107, in update_bot_session
    checked_lease = self._check_lease_state(lease)
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/bots/instance.py", line 174, in _check_lease_state
    self._scheduler.update_job_lease_state(lease.id, lease)
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/scheduler.py", line 387, in update_job_lease_state
    self._update_job_operation_stage(job_name, operation_stage)
  File "/buildgrid/0.0.2~20190606-1+b20190715T09030575/python3.6-buildgrid/lib/python3.6/site-packages/buildgrid/server/scheduler.py", line 605, in _update_job_operation_stage
    job = self.__jobs_by_name[job_name]
KeyError: '3687ac7e-ff7e-4e61-8470-0a79bd4ec462'
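
The trace above ends in a plain dict lookup on a job name that is no longer present. The sketch below reproduces that failure mode and shows one possible guarded variant. The function names and the dict-of-dicts job representation are assumptions made for illustration, not the scheduler's real API.

```python
def update_job_stage(jobs_by_name, job_name, stage):
    """Unguarded lookup: raises KeyError if the job has already been removed."""
    job = jobs_by_name[job_name]
    job["stage"] = stage


def update_job_stage_safe(jobs_by_name, job_name, stage):
    """Guarded lookup: returns False instead of raising when the job is missing."""
    job = jobs_by_name.get(job_name)
    if job is None:
        return False
    job["stage"] = stage
    return True
```

Whether a missing job should be ignored silently or reported is a separate decision that depends on how the lease is meant to be handled.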
Edited by Arber X