Draft: Introduce snapshot statistics
This code change adds support for tracking "snapshot" data in Gitaly JSON logs, which provides additional performance metrics about file system operations. The main improvements include:
New snapshot tracking: The system now captures snapshot information (file counts, directory counts, and duration) from Gitaly logs when available, storing this data separately from regular performance metrics.
Enhanced log parsing: A new log type "GitalyJsonDetail" was added that can handle both basic Gitaly events and those with snapshot data. The system now defaults to using this enhanced parser for all Gitaly JSON logs.
New statistics fields: Several new fields were added to display snapshot-related statistics: snapshot count, maximum and minimum snapshot duration, 95th and 99th percentile snapshot duration, and median file and directory counts. These appear as new columns in the output.
Improved data collection: The system now processes "info" level events for snapshot data collection while still excluding them from main performance statistics, ensuring snapshot metrics are captured even from informational log entries.
Default field changes: The default output now includes the 95th percentile snapshot duration (snapshot-duration-p95) alongside the existing performance metrics.
These changes give users deeper insight into Gitaly's file system performance, which is particularly useful for understanding repository operation patterns and identifying potential bottlenecks in Git operations.
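The collection rule described above (info-level events feed snapshot statistics but stay out of the main performance statistics) can be sketched roughly as follows. This is a minimal illustration, not the code in this change: the `Snapshot`, `GitalyEvent`, and `MethodData` types, their field names, and the `collect` function are all hypothetical, and skipping info-level events that carry no snapshot is an assumption.

```rust
use std::collections::HashMap;

/// Snapshot details attached to some Gitaly events (hypothetical field names).
#[derive(Debug, Clone, PartialEq)]
pub struct Snapshot {
    pub file_count: u64,
    pub dir_count: u64,
    pub duration_ms: f64,
}

/// A parsed Gitaly event; `snapshot` is `None` when the log line has no snapshot data.
#[derive(Debug, Clone)]
pub struct GitalyEvent {
    pub method: String,
    pub level: String,
    pub duration_ms: f64,
    pub snapshot: Option<Snapshot>,
}

/// Per-method accumulator that keeps snapshot data apart from request durations.
#[derive(Debug, Default)]
pub struct MethodData {
    pub durations_ms: Vec<f64>,
    pub snapshots: Vec<Snapshot>,
}

/// Groups events by method. Info-level events contribute only snapshot data;
/// all other events also contribute their duration to the main statistics.
pub fn collect(events: &[GitalyEvent]) -> HashMap<String, MethodData> {
    let mut by_method: HashMap<String, MethodData> = HashMap::new();
    for ev in events {
        let is_info = ev.level.eq_ignore_ascii_case("info");
        // Assumption: an info event without a snapshot carries nothing of interest.
        if is_info && ev.snapshot.is_none() {
            continue;
        }
        let entry = by_method.entry(ev.method.clone()).or_default();
        if !is_info {
            entry.durations_ms.push(ev.duration_ms);
        }
        if let Some(snapshot) = &ev.snapshot {
            entry.snapshots.push(snapshot.clone());
        }
    }
    by_method
}
```

Keeping two separate vectors per method means the snapshot count can differ from the request count, which matches the WriteRef rows in the examples below (1536 requests, 1533 snapshots).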
fast-stats gitaly.log --sort-by snapshot-count
METHOD                    COUNT     RPS  P99_ms  P95_ms  MEDIAN_ms  MAX_ms  MIN_ms  STDDEV      SCORE  %FAIL  SNAP_DUR_P95_ms
WriteRef                   1536   25.60  1621.1  1158.1      562.6  2280.2    59.8   384.8  2490053.1   1.63            38.96
DeleteRefs                 1512   25.20  1728.1  1206.1      703.5  3124.2   150.2   383.0  2612884.4   0.86            35.97
FindCommit                 3189   53.15   972.7   903.2      713.1  1472.9    12.8   178.2  3101881.5   0.09            38.38
ListCommitsByOid           3195   53.25   979.8   907.0      718.1  1211.8     7.0   182.3  3130414.0   0.53            40.22
TreeEntry                  1896   31.60  1603.1  1469.9     1231.2  1838.5    20.4   314.2  3039383.9   1.21            40.18
GetBlobs                   1751   29.18  1789.8  1666.2     1378.1  2127.0    21.5   377.6  3133897.8   0.74            36.63
GetTreeEntries             1659   27.65  1929.8  1785.6     1451.3  2249.6    32.8   388.7  3201529.9   1.87            40.29
ReferenceTransactionHook   7594  126.57    25.5    16.1        1.2    49.5     0.2     5.9   193884.2   0.00             0.00
Example of filtering by fields
./target/debug/fast-stats gitaly.log --print-fields snapshot-count,snapshot-file-ct,snapshot-dir-ct,snapshot-duration-p95,snapshot-duration-max,snapshot-duration-min
METHOD                    SNAP_COUNT  FILE_CT_MED  DIR_CT_MED  SNAP_DUR_P95_ms  SNAP_DUR_MAX_ms  SNAP_DUR_MIN_ms
GetTreeEntries                   340          118          12            40.29            69.29             2.06
GetBlobs                         367          122          12            36.63            71.58             2.03
ListCommitsByOid                 647          124          12            40.22            77.94             1.92
FindCommit                       676           52          12            38.38            64.19             1.97
TreeEntry                        434          122          12            40.18            69.49             1.88
DeleteRefs                      1512          117          12            35.97            83.99             1.64
WriteRef                        1533          117          12            38.96            77.77             1.63
ReferenceTransactionHook           0            0           0             0.00             0.00             0.00
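For readers who want to sanity-check the snapshot columns, the helpers below sketch one way such values could be derived from raw samples. The nearest-rank percentile and the upper-middle median are assumptions chosen for illustration; this description does not say which definitions fast-stats uses, and the function names are hypothetical. Returning zero for empty input mirrors the ReferenceTransactionHook row, which has no snapshots.

```rust
/// Nearest-rank percentile over an unsorted slice; returns 0.0 for empty input.
pub fn percentile(values: &[f64], pct: f64) -> f64 {
    if values.is_empty() {
        return 0.0;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // 1-based rank of the smallest sample that covers `pct` percent of the data.
    let rank = (pct * sorted.len() as f64 / 100.0).ceil() as usize;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Median of integer counts, taking the upper-middle element for even lengths;
/// returns 0 for empty input.
pub fn median_count(counts: &[u64]) -> u64 {
    if counts.is_empty() {
        return 0;
    }
    let mut sorted = counts.to_vec();
    sorted.sort_unstable();
    sorted[sorted.len() / 2]
}
```

With these definitions, SNAP_DUR_P95_ms would correspond to `percentile(&durations, 95.0)` and FILE_CT_MED to `median_count(&file_counts)` for each method.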
Options have been updated with new fields
Options:
  -j, --thread-ct <THREAD_CT>
          Number of threads to spawn. Defaults to 'nproc'
  -c, --compare <COMPARE_FILE>
          Calculate performance changes between two files
  -C, --color-output
          Output colored text even if writing to a pipe or file
  -f, --format <FORMAT>
          Format to print in [default: text] [possible values: text, md, csv, json]
  -i, --interval <INTERVAL_LEN>
          Split results into INTERVAL_LEN time slices. INTERVAL_LEN must be in <LEN>[h|m|s] format
  -l, --limit <LIMIT_CT>
          Number of results to display
  -t, --type <LOG_TYPE>
          Manually specify log type of <INPUT> if log type cannot be deduced automatically [possible values: api, api_dur_ms, gitaly, gitaly_unstructured, production, production_dur_ms, sidekiq, sidekiq_dur_ms, sidekiq_unstructured]
  -p, --print-fields [<PRINT_FIELDS>...]
          List of fields to be printed [default: count rps p99 p95 median max min std-dev score fail snapshot-duration-p95] [possible values: count, rps, p99, p95, median, max, min, score, fail, std-dev, snapshot-duration-max, snapshot-duration-min, snapshot-duration-p95, snapshot-duration-p99, snapshot-count, snapshot-file-ct, snapshot-dir-ct]
  -S, --search <SEARCH_FOR>
          Case-insensitive search of controller/method/worker field
  -s, --sort-by <SORT_BY>
          Field to sort by descending value [default: score] [possible values: count, rps, p99, p95, median, max, min, score, fail, std-dev, snapshot-duration-max, snapshot-duration-min, snapshot-duration-p95, snapshot-duration-p99, snapshot-count, snapshot-file-ct, snapshot-dir-ct]
  -v, --verbose
          Print details of p99, p95, and median events
  -g, --split-graphql
          Split GraphQL operations by operation name in output
  -h, --help
          Print help
  -V, --version
          Print version
Example of comparing with another log
./target/debug/fast-stats gitaly.log --compare test_snapshot.log --print-fields snapshot-count,snapshot-file-ct,snapshot-dir-ct,snapshot-duration-p95
FILE               METHOD    SNAP_COUNT  FILE_CT_MED  DIR_CT_MED  SNAP_DUR_P95_ms
gitaly.log         WriteRef        1533          117          12            38.96
test_snapshot.log                     2          224          67             4.86
ratio                             0.00x        1.91x       5.58x            0.12x
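The ratio row above is consistent with dividing each value from the compared file by the corresponding value from the first file and printing two decimals (for example, 224 / 117 gives 1.91x). A small sketch of that arithmetic, with hypothetical function names and an assumed "n/a" fallback for a zero baseline:

```rust
/// Ratio of the compared value to the baseline value; `None` if the baseline is zero.
pub fn ratio(baseline: f64, other: f64) -> Option<f64> {
    if baseline == 0.0 {
        None
    } else {
        Some(other / baseline)
    }
}

/// Formats the ratio with two decimals and an `x` suffix.
/// The "n/a" text for a zero baseline is an assumption, not the tool's confirmed output.
pub fn format_ratio(baseline: f64, other: f64) -> String {
    match ratio(baseline, other) {
        Some(r) => format!("{:.2}x", r),
        None => "n/a".to_string(),
    }
}
```

Applied to the SNAP_COUNT column, `format_ratio(1533.0, 2.0)` yields 0.00x, which shows how a very small comparison sample rounds to zero at two decimals.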