
Make distributed tracing useful to Gitaly

Quang-Minh Nguyen requested to merge qmnguyen0711/improve-tracing into master

For #4762 (comment 1260932936)

While debugging some performance problems in production, we realized that distributed tracing is now enabled there. However, it has some major problems that make it not very useful. I think this is low-hanging fruit: with a few minor, safe changes, tracing can become a valuable tool for debugging production issues. It's a great addition to the existing toolbox of metrics and logs.

This MR adds a series of changes:

  • Enhance the command span, resolving the following problems:
    • Simplify the command span name. Previously, we used the full command path as the span's operation name. That created a lot of noise and made spans impossible to search. The new version simplifies the name to git-diff, git-rev-parse, tar, du, etc.
    • Attach command results as span tags. The prior version logged command results as detached logs. Depending on the platform, these logs are collected differently. Worse, they may be rejected. Attaching the results as tags improves readability.
  • Fix orphaned spans from the catfile cache. Previously, we created a span before calling catfile.getOrCreateProcess. This span finishes when the process exits. The process cache is intended to share processes between requests, so a process's lifecycle spans multiple requests and may last for several seconds or minutes. As a result, the original span does not finish within the request that created it. Eventually, it is sent to the tracing server and becomes an orphaned span. This adds a lot of noise.
  • Fix orphaned spans while Gitaly boots.
  • Add more spans to key modules, such as internal/middleware/limithandler and internal/git/housekeeping.

Some screenshots to demonstrate the changes:

Before

Screenshot_2023-02-03_at_11.05.02

Screenshot_2023-02-03_at_11.01.48

After

Screenshot_2023-02-03_at_11.40.29

Screenshot_2023-02-03_at_13.24.34

Screenshot_2023-02-03_at_10.40.53

Screenshot_2023-02-07_at_13.15.08

Screenshot_2023-02-07_at_13.15.41

Edited by Quang-Minh Nguyen
