Improve latency attribution observability in Gitaly

This came up in the context of investigating https://gitlab.com/gitlab-com/gl-infra/production/-/issues/8350.

For Rails, we have plenty of per-request timers in the logs. This makes latency attribution easy, since we can see in which subsystems a request spent its time (Postgres, Redis, Gitaly, external HTTP).

We have some of this in Gitaly by means of command stats. But there are still plenty of latency sources that are unaccounted for and difficult to attribute:

While we do have metrics, the logs lack this per-request breakdown. Enriching the logs with additional measurements would make it easier to investigate and improve slow RPCs in Gitaly.