Catch RpcErrors when ByteStream clients disconnect mid-write

Adam Coldrick requested to merge sotk/bugs/catch-bytestream-disconnect into master

Description

This updates the ByteStream Write implementation to catch exceptions raised while iterating over the requests in the stream. Previously, a client disconnecting mid-stream raised an unhandled exception, which added pointless noise to the logs.

Validation

Run a BuildGrid instance, upload a large blob to it, and cancel the upload halfway through. For example:

docker-compose up --build --detach
head -c 1G /dev/urandom > in.txt
tox -e venv -- bgd cas upload-file in.txt
# interrupt before the upload is done

A small patch to make it more obvious when the upload is happening:

diff --git a/buildgrid/client/cas.py b/buildgrid/client/cas.py
index 21bd2fb0..2dea5ab7 100644
--- a/buildgrid/client/cas.py
+++ b/buildgrid/client/cas.py
@@ -918,7 +918,10 @@ class Uploader:
             offset = 0
             finished = False
             remaining = blob_digest.size_bytes - offset
+            req_count = max(1, -(-remaining // MAX_REQUEST_SIZE))  # ceiling division; avoids dividing by zero for empty blobs
+            req_no = 0
             while not finished:
+                req_no += 1
                 chunk_size = min(remaining, MAX_REQUEST_SIZE)
                 remaining -= chunk_size
 
@@ -928,6 +931,7 @@ class Uploader:
                 request.write_offset = offset
                 request.finish_write = remaining <= 0
 
+                print(f"yielding request {req_no}/{req_count}")
                 yield request
 
                 offset += chunk_size
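
For reference, the chunking loop that the patch instruments can be sketched in isolation as below. This is a simplified stand-in rather than the real `Uploader` code: the function name `iter_chunks`, the tuple it yields, and the value given to `MAX_REQUEST_SIZE` are assumptions. It computes the total with ceiling division so the count stays accurate when the blob size is not a multiple of the request size.

```python
MAX_REQUEST_SIZE = 2 * 1024 * 1024  # assumed value, for illustration only


def iter_chunks(size_bytes, max_request_size=MAX_REQUEST_SIZE):
    """Hypothetical helper: yield (req_no, req_count, offset, chunk_size, finish_write)."""
    # Ceiling division; an empty blob still needs one (empty) request.
    req_count = max(1, -(-size_bytes // max_request_size))
    offset = 0
    remaining = size_bytes
    req_no = 0
    finished = False
    while not finished:
        req_no += 1
        chunk_size = min(remaining, max_request_size)
        remaining -= chunk_size
        finished = remaining <= 0
        yield req_no, req_count, offset, chunk_size, finished
        offset += chunk_size


for req_no, req_count, offset, chunk_size, finish_write in iter_chunks(5, 2):
    print(f"yielding request {req_no}/{req_count}")
```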
