Skip to content

Fix quirky problems with tracing

Quang-Minh Nguyen requested to merge qmnguyen0711/fix-git-cat-file-traces into master

A part of For #4762 (closed)

Problems

In a prior MR, I added many improvements to Gitaly tracing infrastructure. The result looks great (captured in this comment). However, there are still two notable issues:

  • Some spans are still orphaned. There should be some entry points where the root span/trace is not enabled.

Screenshot_2023-02-20_at_18.08.31

  • In many occurrences, git-cat-file (or similar git-command spans) is longer than its parent span. After debugging, it turns out the spans for catfile caching are shared between goroutines. Those spans should not be attributed as a part of the goroutine starting the process. We don't even need to measure the process.

Screenshot_2023-02-20_at_18.11.46

Solutions

The first problem is not hard to fix. We can find all the entry points to add the root spans, like what we did in the prior MR. This way doesn't scale, though. If someone creates a new entry point, the orphaned situation shows up again. We need a way to initialize a span only if the context already includes a span. To fix the second problem, we'll need a way to tell the command to not start a span if the process is not cached. The cache management should take care of this responsibility.

Fixing two problems requires some additions to the functionalities that opentracing library and labkit library. So, I decided to implement a thin wrapper wrapping around the underlying libraries. This wrapper provides span validation, as I mentioned above. In addition, it provides some helpful features, such as discard a span branch (fixing the second problem). We'll enforce all tracing creation must go through the interfaces of this wrapper.

Merge request reports