
Draft: Introduce snapshot statistics

This code change adds support for tracking "snapshot" data in Gitaly JSON logs, which provides additional performance metrics about file system operations. The main improvements include:

New snapshot tracking: The system now captures snapshot information (file counts, directory counts, and duration) from Gitaly logs when available, storing this data separately from regular performance metrics.

Enhanced log parsing: A new log type "GitalyJsonDetail" was added that can handle both basic Gitaly events and those with snapshot data. The system now defaults to using this enhanced parser for all Gitaly JSON logs.

New statistics fields: Several new fields were added to display snapshot-related statistics like maximum/minimum snapshot durations, 95th/99th percentiles, and median file/directory counts. These appear as new columns in the output.

Improved data collection: The system now processes "info" level events for snapshot data collection while still excluding them from main performance statistics, ensuring snapshot metrics are captured even from informational log entries.

Default field changes: The default output now includes snapshot duration statistics (95th percentile) alongside existing performance metrics.

These changes show how much time Gitaly spends on snapshots for each method, and how many files and directories those snapshots cover. This helps when analyzing repository operation patterns and looking for file system bottlenecks in Git operations.
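The collection rule described above (snapshot data is gathered from every event that carries it, while "info" level events stay out of the main performance statistics) can be sketched roughly as follows. This is a minimal illustration, not the code in this merge request: the `Snapshot`, `Event`, and `MethodStats` types, their field names, and the `collect` function are all hypothetical, and real Gitaly log entries carry many more fields.

```rust
use std::collections::HashMap;

/// Hypothetical shape of the optional snapshot data attached to an event.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    duration_ms: f64,
    file_count: u64,
    dir_count: u64,
}

/// Hypothetical, simplified Gitaly event (assumed field names).
#[derive(Debug, Clone)]
struct Event {
    method: String,
    level: String,
    duration_ms: f64,
    snapshot: Option<Snapshot>,
}

/// Per-method raw data: main durations and snapshots are stored separately.
#[derive(Debug, Default)]
struct MethodStats {
    durations_ms: Vec<f64>,
    snapshots: Vec<Snapshot>,
}

/// Groups events by method, applying the two collection rules.
fn collect(events: &[Event]) -> HashMap<String, MethodStats> {
    let mut by_method: HashMap<String, MethodStats> = HashMap::new();
    for event in events {
        let stats = by_method.entry(event.method.clone()).or_default();
        // Snapshot data is kept whenever present, even on "info" events.
        if let Some(snapshot) = &event.snapshot {
            stats.snapshots.push(snapshot.clone());
        }
        // "info" events do not contribute to the main performance statistics.
        if event.level != "info" {
            stats.durations_ms.push(event.duration_ms);
        }
    }
    by_method
}
```

Keeping the two vectors separate is what lets the snapshot count for a method differ from its main event count, as in the example output.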

Example of sorting by snapshot count (default fields shown)

fast-stats gitaly.log --sort-by snapshot-count

METHOD                       COUNT     RPS    P99_ms    P95_ms   MEDIAN_ms    MAX_ms    MIN_ms    STDDEV      SCORE     %FAIL  SNAP_DUR_P95_ms
WriteRef                      1536   25.60    1621.1    1158.1       562.6    2280.2      59.8     384.8  2490053.1      1.63            38.96
DeleteRefs                    1512   25.20    1728.1    1206.1       703.5    3124.2     150.2     383.0  2612884.4      0.86            35.97
FindCommit                    3189   53.15     972.7     903.2       713.1    1472.9      12.8     178.2  3101881.5      0.09            38.38
ListCommitsByOid              3195   53.25     979.8     907.0       718.1    1211.8       7.0     182.3  3130414.0      0.53            40.22
TreeEntry                     1896   31.60    1603.1    1469.9      1231.2    1838.5      20.4     314.2  3039383.9      1.21            40.18
GetBlobs                      1751   29.18    1789.8    1666.2      1378.1    2127.0      21.5     377.6  3133897.8      0.74            36.63
GetTreeEntries                1659   27.65    1929.8    1785.6      1451.3    2249.6      32.8     388.7  3201529.9      1.87            40.29
ReferenceTransactionHook      7594  126.57      25.5      16.1         1.2      49.5       0.2       5.9   193884.2      0.00             0.00

Example of selecting fields with --print-fields

./target/debug/fast-stats gitaly.log --print-fields snapshot-count,snapshot-file-ct,snapshot-dir-ct,snapshot-duration-p95,snapshot-duration-max,snapshot-duration-min
METHOD                    SNAP_COUNT        FILE_CT_MED        DIR_CT_MED  SNAP_DUR_P95_ms  SNAP_DUR_MAX_ms  SNAP_DUR_MIN_ms
GetTreeEntries                   340                118                12            40.29            69.29             2.06
GetBlobs                         367                122                12            36.63            71.58             2.03
ListCommitsByOid                 647                124                12            40.22            77.94             1.92
FindCommit                       676                 52                12            38.38            64.19             1.97
TreeEntry                        434                122                12            40.18            69.49             1.88
DeleteRefs                      1512                117                12            35.97            83.99             1.64
WriteRef                        1533                117                12            38.96            77.77             1.63
ReferenceTransactionHook           0                  0                 0             0.00             0.00             0.00
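For readers unfamiliar with how columns like these are derived, here is a minimal sketch of computing the snapshot duration percentiles, max, min, and median file count for one method. It assumes the nearest-rank percentile method and an upper-middle median for even lengths; fast-stats may compute these differently. The `SnapshotSummary` type and all function names are hypothetical.

```rust
/// Nearest-rank percentile over an ascending-sorted slice, `pct` in (0, 100].
/// Returns None for an empty slice.
fn percentile(sorted: &[f64], pct: f64) -> Option<f64> {
    if sorted.is_empty() {
        return None;
    }
    // Rank is 1-based: the smallest rank covering at least pct% of values.
    let rank = (pct * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.clamp(1, sorted.len()) - 1])
}

/// Median of an ascending-sorted slice (upper middle element for even lengths).
fn median(sorted: &[u64]) -> Option<u64> {
    sorted.get(sorted.len() / 2).copied()
}

/// Hypothetical summary mirroring some of the snapshot columns.
#[derive(Debug, PartialEq)]
struct SnapshotSummary {
    count: usize,
    dur_p95_ms: f64,
    dur_p99_ms: f64,
    dur_max_ms: f64,
    dur_min_ms: f64,
    file_ct_median: u64,
}

/// Summarizes one method's snapshots; None when there are no snapshots.
fn summarize(durations_ms: &[f64], file_counts: &[u64]) -> Option<SnapshotSummary> {
    let mut durations = durations_ms.to_vec();
    durations.sort_by(|a, b| a.total_cmp(b));
    let mut files = file_counts.to_vec();
    files.sort_unstable();
    Some(SnapshotSummary {
        count: durations.len(),
        dur_p95_ms: percentile(&durations, 95.0)?,
        dur_p99_ms: percentile(&durations, 99.0)?,
        dur_max_ms: *durations.last()?,
        dur_min_ms: *durations.first()?,
        file_ct_median: median(&files)?,
    })
}
```

The sketch returns `None` for a method without snapshots, whereas the output above prints zeros for ReferenceTransactionHook; either is a reasonable way to represent "no data".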

Options have been updated with new fields

Options:
  -j, --thread-ct <THREAD_CT>
          Number of threads to spawn. Defaults to 'nproc'
  -c, --compare <COMPARE_FILE>
          Calculate performance changes between two files
  -C, --color-output
          Output colored text even if writing to a pipe or file
  -f, --format <FORMAT>
          Format to print in [default: text] [possible values: text, md, csv, json]
  -i, --interval <INTERVAL_LEN>
          Split results into INTERVAL_LEN time slices. INTERVAL_LEN must be in <LEN>[h|m|s] format
  -l, --limit <LIMIT_CT>
          Number of results to display
  -t, --type <LOG_TYPE>
          Manually specify log type of <INPUT> if log type cannot be deduced automatically [possible values: api, api_dur_ms, gitaly, gitaly_unstructured, production, production_dur_ms, sidekiq, sidekiq_dur_ms, sidekiq_unstructured]
  -p, --print-fields [<PRINT_FIELDS>...]
          List of fields to be printed [default: count rps p99 p95 median max min std-dev score fail snapshot-duration-p95] [possible values: count, rps, p99, p95, median, max, min, score, fail, std-dev, snapshot-duration-max, snapshot-duration-min, snapshot-duration-p95, snapshot-duration-p99, snapshot-count, snapshot-file-ct, snapshot-dir-ct]
  -S, --search <SEARCH_FOR>
          Case-insensitive search of controller/method/worker field
  -s, --sort-by <SORT_BY>
          Field to sort by descending value [default: score] [possible values: count, rps, p99, p95, median, max, min, score, fail, std-dev, snapshot-duration-max, snapshot-duration-min, snapshot-duration-p95, snapshot-duration-p99, snapshot-count, snapshot-file-ct, snapshot-dir-ct]
  -v, --verbose
          Print details of p99, p95, and median events
  -g, --split-graphql
          Split GraphQL operations by operation name in output
  -h, --help
          Print help
  -V, --version
          Print version

Example of comparing against another log

./target/debug/fast-stats gitaly.log --compare test_snapshot.log --print-fields snapshot-count,snapshot-file-ct,snapshot-dir-ct,snapshot-duration-p95
FILE               METHOD    SNAP_COUNT        FILE_CT_MED        DIR_CT_MED  SNAP_DUR_P95_ms
gitaly.log         WriteRef        1533                117                12            38.96
test_snapshot.log                     2                224                67             4.86
ratio                             0.00x              1.91x             5.58x            0.12x
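The values in the ratio row are consistent with dividing each value from the compared file by the corresponding value from the first file (for example, 224 / 117 ≈ 1.91). A minimal sketch of that arithmetic follows; the function names are hypothetical, and the "n/a" fallback for a zero baseline is an assumption, not something shown in the output above.

```rust
/// Ratio of the compared file's value to the first file's value.
/// Returns None when the baseline is zero, since the ratio is undefined.
fn ratio(base: f64, other: f64) -> Option<f64> {
    if base == 0.0 {
        None
    } else {
        Some(other / base)
    }
}

/// Formats a ratio with two decimals and an "x" suffix, e.g. "1.91x".
fn format_ratio(base: f64, other: f64) -> String {
    match ratio(base, other) {
        Some(r) => format!("{:.2}x", r),
        // Assumed placeholder for an undefined ratio.
        None => "n/a".to_string(),
    }
}
```

With the numbers from the table, `format_ratio(117.0, 224.0)` gives `1.91x`, and `format_ratio(1533.0, 2.0)` gives `0.00x` because 2 / 1533 rounds to zero at two decimals.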
