
client: Add RetryPolicy to default ServiceConfig

Will Chandler (ex-GitLab) requested to merge wc/service-config-retries into master

We have added a gRPC RetryPolicy to the Rails client, but so far held back from doing the same for gitlab-shell and gitlab-workhorse. The concern was that an in-flight operation might be retried, because the response header is only sent at the end of unary operations such as SSHUploadPackWithSidechannel.

However, in testing this has not been a problem, so let's go ahead and add retries to our Golang clients.

Note that gRPC does not apply a RetryPolicy when first establishing the connection, as will be the case with a newly forked gitlab-shell process. Such initial calls will fail immediately unless the WaitForReady option is set, but enabling that will also cause the client to keep the connection open indefinitely. There is no way to wait N seconds for the connection to become available without setting a deadline for the entire request, which is not compatible with the Git operations handled by this client.

Instances using gitlab-sshd will not have this problem during a Gitaly restart, as the process will have already established its connection. The same applies to gitlab-workhorse and HTTP operations.

Add a RetryPolicy for the six read-only RPCs exposed to gitlab-shell and gitlab-workhorse using the same parameters as used with rails. This allows up to three retry attempts, starting with a 400ms delay and doubling with each subsequent attempt (plus jitter), resulting in a grace period of roughly 2.4 seconds.
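As a rough illustration of the parameters described above, the following sketch builds a gRPC service config JSON document with a retry policy using only the Go standard library. It is not the code from this merge request. The struct and function names are hypothetical, the `maxBackoff` value and the `UNAVAILABLE` status code are assumptions (the description does not state them), and the `gitaly.SSHService` service name with `SSHUploadPackWithSidechannel` is used purely as an example entry rather than as the confirmed list of six RPCs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// retryPolicy mirrors the "retryPolicy" object of a gRPC service config.
type retryPolicy struct {
	MaxAttempts          int      `json:"maxAttempts"`
	InitialBackoff       string   `json:"initialBackoff"`
	MaxBackoff           string   `json:"maxBackoff"`
	BackoffMultiplier    float64  `json:"backoffMultiplier"`
	RetryableStatusCodes []string `json:"retryableStatusCodes"`
}

// methodName selects the RPC that a methodConfig entry applies to.
type methodName struct {
	Service string `json:"service"`
	Method  string `json:"method,omitempty"`
}

type methodConfig struct {
	Name        []methodName `json:"name"`
	RetryPolicy *retryPolicy `json:"retryPolicy,omitempty"`
}

type serviceConfig struct {
	MethodConfig []methodConfig `json:"methodConfig"`
}

// defaultRetryPolicy returns the parameters from the description:
// one initial attempt plus up to three retries, starting at 400ms and doubling.
func defaultRetryPolicy() *retryPolicy {
	return &retryPolicy{
		MaxAttempts:          4, // initial request plus three retries
		InitialBackoff:       "0.4s",
		MaxBackoff:           "1.4s", // assumption: not stated in the description
		BackoffMultiplier:    2,
		RetryableStatusCodes: []string{"UNAVAILABLE"}, // assumption
	}
}

// buildServiceConfig renders a service config JSON string that applies
// the default retry policy to the given methods.
func buildServiceConfig(names []methodName) (string, error) {
	cfg := serviceConfig{
		MethodConfig: []methodConfig{{Name: names, RetryPolicy: defaultRetryPolicy()}},
	}
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Illustrative entry only; the real change covers six read-only RPCs.
	names := []methodName{{Service: "gitaly.SSHService", Method: "SSHUploadPackWithSidechannel"}}
	cfg, err := buildServiceConfig(names)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg)
}
```

In grpc-go, a string like this can be supplied as the client's default service config through the `grpc.WithDefaultServiceConfig` dial option.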
