Disambiguate concurrency limit conditions "maximum time in concurrency queue reached" and "maximum queue size reached"

This came up in gitlab-com/gl-infra/production#7484 (closed).

We currently treat both concurrency limiting cases as an Unavailable error, which the SLO ignores. However, the two cases behave very differently. "Maximum time in concurrency queue reached" indicates that a single RPC waited too long, whereas "maximum queue size reached" indicates that the entire queue is full and all incoming requests are being dropped.

The latter condition potentially has a much broader effect, and indicates either many long-running clients clogging the queue or a misconfigured queue size that does not provide an adequate buffer.

At the very least, we should be able to tell these cases apart in our metrics. Currently they're all lumped together in gitaly_service_client_requests_total{grpc_code="Unavailable",grpc_method=~"PostUploadPackWithSidechannel|SSHUploadPackWithSidechannel"}.

Edited Jul 21, 2022 by Igor
