
Return Canceled gRPC code if Sidechannel client hangs up

Closes gitlab-com/gl-infra/scalability#1302 (closed)

With the sidechannel protocol, we open connections to transfer data inside the RPC handler. Those connections are transparent to gRPC: the gRPC server is neither aware of them nor in control of them. As a result, a sidechannel connection can be cancelled while the returned status code is Unknown. This MR wraps such errors into Canceled codes.

One way of doing this is to capture yamux-specific errors in the internal/middleware/cancelhandler/cancelhandler.go middleware. Two errors are relevant to this scenario, ErrStreamClosed and ErrConnectionReset. They arise as follows:

  • When the client calls Close(), flagFIN is sent to the server.
    • The server starts a force close timer, but is still able to write.
    • The client starts a force close timer as well.
  • When the timers are due, any read/write operation returns ErrStreamClosed, and flagRST is sent to the other side.
  • When either side receives flagRST, any read/write operation returns ErrConnectionReset.

Apart from being meaningful for this issue, the backchannel and sidechannel options also open a way for us to customize the yamux configuration in case the current hard-coded configuration is not suitable for sidechannel. The configuration is currently optimized for the single Praefect-Gitaly connection.

Note that the configurations of clients and servers are independent. When a client or server opens a stream to the other side, each side creates its own stream struct, and any read/write operation uses the local configuration. In the sidechannel protocol, we open new streams from both clients and servers. This means we have to set the configuration on both sides and keep them in sync.

Edited by Quang-Minh Nguyen
