
packed_binaries: Extract binaries in parallel

Will Chandler (ex-GitLab) requested to merge wc/parallel-extract into master

Currently we extract each of Gitaly's packed binaries serially. This task consumes the majority of the time spent during initialization. By extracting these files in parallel we can measurably reduce startup time.

On a 4-core system this improves startup time by ~10%, using the same benchmarking process as 96438c24 (gitaly: Don't block on preloading licensedb, 2023-09-20):

  Benchmark 1: ./gitaly-par serve config.toml
    Time (mean ± σ):     228.5 ms ±   4.2 ms    [User: 285.8 ms, System: 76.8 ms]
    Range (min … max):   222.7 ms … 237.0 ms    13 runs

  Benchmark 2: ./gitaly-st serve config.toml
    Time (mean ± σ):     254.5 ms ±   6.9 ms    [User: 315.1 ms, System: 75.6 ms]
    Range (min … max):   246.7 ms … 272.2 ms    11 runs

  Summary
    ./gitaly-par serve config.toml ran
      1.11 ± 0.04 times faster than ./gitaly-st serve config.toml

On a 16-core system this improves to ~20%:

  Benchmark 1: ./gitaly-par serve config.toml
    Time (mean ± σ):     234.7 ms ±   6.0 ms    [User: 326.4 ms, System: 169.5 ms]
    Range (min … max):   228.6 ms … 247.5 ms    12 runs

  Benchmark 2: ./gitaly-st serve config.toml
    Time (mean ± σ):     282.9 ms ±  10.4 ms    [User: 377.1 ms, System: 156.3 ms]
    Range (min … max):   266.3 ms … 302.0 ms    11 runs

  Summary
    './gitaly-par serve config.toml' ran
      1.21 ± 0.05 times faster than './gitaly-st serve config.toml'

This does place more demand on the disk, but only momentarily. When the host is under heavy I/O pressure, simulated here with stress-ng --iomix 5, the advantage of parallel extraction grows to ~36% on the 16-core system, although run-to-run variance is much higher:

  Benchmark 1: ./gitaly-par serve config.toml
    Time (mean ± σ):     545.9 ms ± 159.9 ms    [User: 731.7 ms, System: 177.7 ms]
    Range (min … max):   326.9 ms … 913.7 ms    10 runs

  Benchmark 2: ./gitaly-st serve config.toml
    Time (mean ± σ):     740.9 ms ± 242.8 ms    [User: 977.4 ms, System: 150.7 ms]
    Range (min … max):   378.9 ms … 1029.0 ms    10 runs

  Summary
    './gitaly-par serve config.toml' ran
      1.36 ± 0.60 times faster than './gitaly-st serve config.toml'