Implement Git trace2 integration

Git bundles a tracing facility called Trace2. It lets Git processes emit significant internal events together with their measurements, and it also covers the activity of child processes. Trace2 adds some overhead to Git processes, and passing events through a file descriptor carries some risk, so it may not be wise to enable it for every Git process. There are still good use cases for trace2 if we enable it for a subset of processes:

This MR implements a foundation for trace2 integration. Trace2 is enabled by passing environment variables to the Git processes we spawn. The list of environment variables includes:

  • GIT_TRACE2_EVENT: this variable makes Git export instrumentation data in JSON format. It supports dumping the data to stderr, to a file, or to a UNIX socket. I decided to use a tempfile because this channel is simple and highly isolated. Gitaly creates a tempfile for each Git process and cleans it up afterward. The initial Git process naturally passes the same variables to its child processes, so they dump their events to the same file, which makes data collection easier later. Dumping events to stderr is a no-go. Using a shared UNIX socket for all processes seems like a safer option, but it complicates data collection and handling.
  • GIT_TRACE2_PARENT_SID: the session ID (SID) that a spawned Git process records as its parent. Because PIDs are re-used, Git uses the SID to identify the owner of an event. I set it to the correlation ID from the application context. In practice, we don't use this value yet. If we switch to a UNIX socket, it can be used to split the events.
  • GIT_TRACE2_BRIEF: this variable omits debugging information such as file, line, and even absolute time. The absolute timestamp is stripped from most events and kept only in key events. The timestamps of the other events must be inferred from that key timestamp and the relative time differences, which are floats.
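
To make the three variables concrete, here is a minimal Go sketch of spawning one Git process with Trace2 writing to a per-process tempfile. It is not the code from this MR: `trace2Env` and `runWithTrace2` are hypothetical names, the tempfile pattern is made up, and the correlation ID is a placeholder string.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// trace2Env builds the environment variables discussed above.
// Hypothetical helper: the name and parameters are not Gitaly's API.
func trace2Env(eventFile, correlationID string) []string {
	return []string{
		// An absolute path makes Git append JSON events to that file.
		"GIT_TRACE2_EVENT=" + eventFile,
		// Reuse the correlation ID as the parent session ID.
		"GIT_TRACE2_PARENT_SID=" + correlationID,
		// Omit file, line, and most absolute timestamps.
		"GIT_TRACE2_BRIEF=true",
	}
}

// runWithTrace2 runs one Git command with Trace2 enabled, then returns
// the raw events and removes the tempfile.
func runWithTrace2(correlationID string, args ...string) ([]byte, error) {
	f, err := os.CreateTemp("", "gitaly-trace-*")
	if err != nil {
		return nil, fmt.Errorf("create tempfile: %w", err)
	}
	f.Close()
	defer os.Remove(f.Name())

	cmd := exec.Command("git", args...)
	// Child Git processes inherit these variables and write to the same file.
	cmd.Env = append(os.Environ(), trace2Env(f.Name(), correlationID)...)
	if err := cmd.Run(); err != nil {
		return nil, fmt.Errorf("run git: %w", err)
	}
	return os.ReadFile(f.Name())
}

func main() {
	events, err := runWithTrace2("correlation-id-123", "version")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("%s", events)
}
```

Reading the file only after the command has finished keeps the sketch simple, at the cost of holding all events until the process exits.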

The events are structured as a flat list, sorted in chronological order. Each event describes a certain sub-operation and carries some related data and metadata. Some events, such as "region_enter" or "start", open a new section to which the subsequent events belong. Correspondingly, exiting events, such as "region_leave" or "atexit", close the current section. They look like the following:

{"event":"version","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.620713Z","file":"common-main.c","line":38,"evt":"3","exe":"2.20.1.155.g426c96fcdb"}
{"event":"start","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621027Z","file":"common-main.c","line":39,"t_abs":0.001173,"argv":["git","version"]}
{"event":"cmd_name","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621122Z","file":"git.c","line":432,"name":"version","hierarchy":"version"}
{"event":"exit","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621236Z","file":"git.c","line":662,"t_abs":0.001227,"code":0}
{"event":"atexit","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621268Z","file":"trace2/tr2_tgt_event.c","line":163,"t_abs":0.001265,"code":0}

As mentioned above, we want to avoid enabling trace2 where possible. Furthermore, the data is used for different purposes. So, the events are parsed into an internal tree data structure. This tree provides a hooking mechanism: for each purpose, we implement a hook into the tree. The integration looks like the following diagram:

flowchart TD
  git.CommandFactory --WithTrace2Hooks option--> trace2.Manager
  git.CommandFactory -- Spawns --> command.New
  trace2.Manager -- Injects ENVs --> command.New
  trace2.Manager -- Manages --> trace2hooks.TracingExporter
  trace2.Manager -- Manages --> trace2hooks.PackObjectsMetrics
  command.New -- Spawns --> GitProcess[Git process]
  GitProcess -- Writes to --> Tempfile["/tmp/gitaly-trace-2352"]
  Tempfile -- Collected and parsed --> trace2.Trace
  trace2hooks.TracingExporter -- Handles --> trace2.Trace
  trace2hooks.PackObjectsMetrics -- Handles --> trace2.Trace
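
One way to turn the flat, chronological event list into a tree is to keep a pointer to the current section and move it on enter and leave events. The sketch below is an assumption about the general technique, not the tree from this MR: `Event`, `Node`, `buildTree`, and `walk` are hypothetical, and only "region_enter" and "region_leave" are treated as section boundaries.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a simplified, hypothetical event: its type plus an optional label.
type Event struct {
	Name  string // e.g. "region_enter", "data", "region_leave"
	Label string // section label, used for region_enter
}

// Node is one element of the resulting tree.
type Node struct {
	Name     string
	Parent   *Node
	Children []*Node
}

// buildTree nests events under the section that was open when they occurred.
func buildTree(events []Event) *Node {
	root := &Node{Name: "root"}
	current := root
	for _, e := range events {
		switch e.Name {
		case "region_enter":
			// Open a new section; subsequent events become its children.
			child := &Node{Name: e.Label, Parent: current}
			current.Children = append(current.Children, child)
			current = child
		case "region_leave":
			// Close the current section, never climbing above the root.
			if current.Parent != nil {
				current = current.Parent
			}
		default:
			leaf := &Node{Name: e.Name, Parent: current}
			current.Children = append(current.Children, leaf)
		}
	}
	return root
}

// walk visits every node depth-first, which is what a hook would do.
func walk(n *Node, depth int, visit func(n *Node, depth int)) {
	visit(n, depth)
	for _, c := range n.Children {
		walk(c, depth+1, visit)
	}
}

func main() {
	root := buildTree([]Event{
		{Name: "version"},
		{Name: "region_enter", Label: "index"},
		{Name: "data"},
		{Name: "region_leave"},
		{Name: "exit"},
	})
	walk(root, 0, func(n *Node, depth int) {
		fmt.Println(strings.Repeat("  ", depth) + n.Name)
	})
}
```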

Each hook implements two functions: Activated and Handle. The former tells the manager whether the hook needs the data. The manager enables trace2 if any of the hooks requires the data. If none of them do, trace2 is not enabled. The latter is triggered when the tree is ready to walk. Here are the drafts of referenced implementations:

I also add some fields to the log when trace2 is enabled. We can use them to track the rollout status. Since no hook exists after this MR, it doesn't lead to any changes yet.

Edited by Quang-Minh Nguyen