
Reserve Unavailable response code for service-level unavailability

For https://gitlab.com/gitlab-org/gitaly/-/issues/5149

This MR replaces all inappropriate uses of the Unavailable status code. That status code is now reserved for cases in which clients are encouraged to retry. The most suitable places for this status code are interceptors and network-related components such as load balancing. The official gRPC documentation distinguishes the usage of these status codes as follows:

(a) Use UNAVAILABLE if the client can retry just the failing call.
(b) Use ABORTED if the client should retry at a higher level.
(c) Use FAILED_PRECONDITION if the client should not retry until the system state has been explicitly fixed.
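The three rules above can be read as a client-side decision table. The following is a minimal sketch, not Gitaly code: `Code`, `RetryAdvice`, and `adviceFor` are hypothetical names, `Code` is a local stand-in for the gRPC status code type so that the example has no dependencies, and the "do not retry" default for all other codes is an assumption of this sketch.

```go
package main

// Code is a local stand-in for a gRPC status code (hypothetical, for illustration only).
type Code int

const (
	Unavailable Code = iota
	Aborted
	FailedPrecondition
	Internal
)

// RetryAdvice describes what a client should do after a failed call.
type RetryAdvice string

const (
	RetryCall     RetryAdvice = "retry just the failing call"
	RetryHigher   RetryAdvice = "retry at a higher level"
	FixStateFirst RetryAdvice = "do not retry until the system state has been fixed"
	NoRetry       RetryAdvice = "do not retry"
)

// adviceFor maps a status code to the retry behaviour quoted in rules (a) to (c).
// Codes not covered by those rules fall back to NoRetry, which is an assumption.
func adviceFor(c Code) RetryAdvice {
	switch c {
	case Unavailable:
		return RetryCall
	case Aborted:
		return RetryHigher
	case FailedPrecondition:
		return FixStateFirst
	default:
		return NoRetry
	}
}
```

Seen this way, returning Unavailable from a handler is a promise that an immediate retry of the same call is worthwhile, which is why the code is reserved for service-level unavailability.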

Multiple places in the source code capture the error from sending a streaming message and wrap it in an Unavailable code. This status code may not be correct, because sending can also fail with other, less common errors, such as buffer overflow (ResourceExhausted), max message size exceeded (ResourceExhausted), or encoding failure (Internal). The handler should instead propagate the error as is.
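To illustrate why wrapping loses information, here is a small dependency-free sketch. All names (`statusError`, `sender`, `limitedStream`, `sendWrapped`, `sendAsIs`, `codeOf`) are hypothetical and only model the pattern; they are not Gitaly or gRPC APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// Code is a local stand-in for a gRPC status code (hypothetical).
type Code int

const (
	Unavailable Code = iota
	ResourceExhausted
	Internal
)

// statusError models an error that carries a status code.
type statusError struct {
	Code Code
	Msg  string
}

func (e *statusError) Error() string { return fmt.Sprintf("code %d: %s", e.Code, e.Msg) }

// sender models the sending side of a stream.
type sender interface {
	Send(chunk []byte) error
}

// limitedStream rejects messages larger than maxSize with ResourceExhausted.
type limitedStream struct{ maxSize int }

func (s limitedStream) Send(chunk []byte) error {
	if len(chunk) > s.maxSize {
		return &statusError{Code: ResourceExhausted, Msg: "max message size exceeded"}
	}
	return nil
}

// sendWrapped shows the discouraged pattern: every send failure becomes Unavailable.
func sendWrapped(s sender, chunk []byte) error {
	if err := s.Send(chunk); err != nil {
		return &statusError{Code: Unavailable, Msg: err.Error()}
	}
	return nil
}

// sendAsIs shows the preferred pattern: the original error is returned unchanged.
func sendAsIs(s sender, chunk []byte) error {
	return s.Send(chunk)
}

// codeOf extracts the status code from an error, if it carries one.
func codeOf(err error) (Code, bool) {
	var se *statusError
	if errors.As(err, &se) {
		return se.Code, true
	}
	return 0, false
}
```

With an oversized message, `sendWrapped` reports Unavailable and invites a retry that can never succeed, while `sendAsIs` preserves ResourceExhausted.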

Another commonly misused pattern is wrapping the exit code of a spawned process. In many cases, if Gitaly can intercept the exit code and/or the error from stderr, it must return a precise error code (InvalidArgument, NotFound, Internal). However, Git processes may exit with exit code 128 and unparseable stderr. We can interpret this as the operation being rejected because the system is not in a state where it can be executed (resource inaccessible, invalid refs, etc.). FailedPrecondition is the most suitable choice in that case.
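The classification described above could look roughly like the following sketch. It is illustrative only: `classifyGitFailure` is a hypothetical helper, `Code` is a local stand-in for the gRPC status code type, the single stderr pattern and its mapping to NotFound are assumptions, and so is the Internal fallback for other exit codes.

```go
package main

import "strings"

// Code is a local stand-in for a gRPC status code (hypothetical).
type Code int

const (
	NotFound Code = iota
	InvalidArgument
	FailedPrecondition
	Internal
)

// classifyGitFailure picks a status code for a failed Git process.
// A recognised stderr message gets a precise code; exit code 128 with
// unrecognised stderr becomes FailedPrecondition rather than Unavailable.
func classifyGitFailure(exitCode int, stderr string) Code {
	switch {
	case strings.Contains(stderr, "not a git repository"):
		// Illustrative mapping only: stderr was parseable, so use a precise code.
		return NotFound
	case exitCode == 128:
		// Unparseable stderr: the system is not in a state where the
		// operation can be executed.
		return FailedPrecondition
	default:
		// Assumption of this sketch: anything else is treated as Internal.
		return Internal
	}
}
```

The important property is that no branch returns Unavailable, so a client is never encouraged to retry a call that fails because of repository state.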

This MR also adds a linter that warns about occurrences where the Unavailable code is used.
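The linter added by this MR is not reproduced here. As a rough idea of how such a check can work, the sketch below uses only the Go standard library to find `codes.Unavailable` selector expressions in a source string. `findUnavailable` is a hypothetical name, and matching on the literal package identifier `codes` is a simplifying assumption (a real linter would resolve imports and types).

```go
package main

import (
	"go/ast"
	"go/parser"
	"go/token"
)

// findUnavailable returns the line numbers at which the expression
// codes.Unavailable appears in the given Go source.
func findUnavailable(src string) ([]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}

	var lines []int
	ast.Inspect(file, func(n ast.Node) bool {
		sel, ok := n.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		// Match purely on names; no type information is used in this sketch.
		if pkg, ok := sel.X.(*ast.Ident); ok && pkg.Name == "codes" && sel.Sel.Name == "Unavailable" {
			lines = append(lines, fset.Position(sel.Pos()).Line)
		}
		return true
	})
	return lines, nil
}
```

A warning-level check like this leaves room for the legitimate uses, such as interceptors and load-balancing components, to be reviewed and kept.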


Edited by Quang-Minh Nguyen
