Improve error propagation from gRPC client calls
Description
If an error occurs when a BuildGrid server calls a remote gRPC server (e.g. the execution server fetching a message from the CAS server), either an RpcError (for UNAVAILABLE and ABORTED, after exhausting retries) or a ConnectionError is raised. There is no catch clause for either of these errors in service.py, which means that the client of the BuildGrid server will receive an INTERNAL error, sometimes even without any error details.
However, in many cases it would be preferable to propagate the gRPC status code from the remote server to the client, so that error information is not lost from client logs and the client can handle selected status codes differently (e.g. the status code may affect the retry decision).
Changes proposed in this merge request:
- client: Move GrpcRetrier to a separate module
- client/retrier.py: Map gRPC errors to suitable exceptions in GrpcRetrier
- server (service.py): Map ConnectionError to UNAVAILABLE
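
For reviewers, the intent of the two mapping changes can be sketched roughly as follows. This is an illustration only, not the code in this merge request: to stay runnable without grpcio it uses stand-ins (StatusCode, RemoteCallError, FakeContext) in place of grpc.StatusCode, grpc.RpcError and the servicer context, and the names NotFoundError, call_with_mapping and handle_request are hypothetical. The particular code-to-exception table is also an assumption.

```python
import enum


class StatusCode(enum.Enum):
    """Stand-in for grpc.StatusCode (assumption: only the codes used here)."""
    UNAVAILABLE = "unavailable"
    ABORTED = "aborted"
    NOT_FOUND = "not found"
    INTERNAL = "internal"


class RemoteCallError(Exception):
    """Stand-in for grpc.RpcError: exposes code() and details()."""

    def __init__(self, code, details=""):
        super().__init__(details)
        self._code = code
        self._details = details

    def code(self):
        return self._code

    def details(self):
        return self._details


class NotFoundError(Exception):
    """Hypothetical exception for a remote NOT_FOUND."""


# Hypothetical mapping from remote status codes to local exceptions.
EXCEPTION_FOR_CODE = {
    StatusCode.UNAVAILABLE: ConnectionError,
    StatusCode.NOT_FOUND: NotFoundError,
}
RETRIABLE_CODES = {StatusCode.UNAVAILABLE, StatusCode.ABORTED}


def call_with_mapping(func, max_attempts=3):
    """Retry func on retriable codes, then raise a mapped exception."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return func()
        except RemoteCallError as e:
            last_error = e
            if e.code() not in RETRIABLE_CODES:
                break  # non-retriable: map immediately
    # Codes without an entry fall back to RuntimeError in this sketch.
    exc_type = EXCEPTION_FOR_CODE.get(last_error.code(), RuntimeError)
    raise exc_type(last_error.details()) from last_error


class FakeContext:
    """Stand-in for the servicer context; records code and details."""

    def __init__(self):
        self.code = None
        self.details = None

    def set_code(self, code):
        self.code = code

    def set_details(self, details):
        self.details = details


def handle_request(handler, context):
    """Service-layer wrapper: translate exceptions into status codes."""
    try:
        return handler()
    except ConnectionError as e:
        context.set_code(StatusCode.UNAVAILABLE)
        context.set_details(str(e))
    except NotFoundError as e:
        context.set_code(StatusCode.NOT_FOUND)
        context.set_details(str(e))
    return None
```

With this shape, a remote UNAVAILABLE that survives all retries reaches the client as UNAVAILABLE with the original details, instead of a bare INTERNAL.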
Validation
A few exception checks in the test suite have been updated and behave as expected. No integration tests have been added for this.