Support: Add benchmarking Ansible script (!5206) · Merge requests · GitLab.org / gitaly

Will Chandler (ex-GitLab) requested to merge wc/benchmarking into master Dec 20, 2022

Currently Gitaly running on GitLab.com is well-instrumented, which generally allows us to catch major performance regressions quickly. However, small regressions are likely to sneak through due to the inherently noisy nature of user traffic. To get well-defined measures of performance we will need a separate environment running consistent workloads against immutable repositories.

To begin addressing this need, this commit adds an Ansible script that will create a standalone Gitaly node and a client node from which to send traffic. We are not using Omnibus GitLab so that arbitrary Gitaly revisions may be benchmarked.

We use ghz, a gRPC benchmarking tool, to send traffic to Gitaly. ghz supports streaming RPCs, unlike k6, which the Quality team uses for GPT. The ghz logs and Gitaly logs are included in the results archive.
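
As a rough illustration of what a ghz invocation against Gitaly might look like, the sketch below assembles a command line for the FindCommit RPC. It is not taken from the Ansible script in this merge request: the address, proto path, storage name, and repository path are hypothetical placeholders, and the base64 encoding of `revision` assumes ghz follows the standard protobuf JSON mapping for bytes fields. Depending on how the proto files are laid out, ghz's `--import-paths` option may also be needed.

```python
import base64
import json


def build_ghz_command(address, proto_path, storage, relative_path, revision,
                      total=1000, concurrency=10):
    """Return an argv list for a ghz run against Gitaly's FindCommit RPC.

    All argument values are supplied by the caller; nothing here reflects
    the defaults used by the actual benchmarking script.
    """
    request = {
        "repository": {
            "storage_name": storage,
            "relative_path": relative_path,
        },
        # `revision` is a bytes field; the protobuf JSON mapping
        # represents bytes as base64 text (assumed to apply to ghz input).
        "revision": base64.b64encode(revision.encode()).decode(),
    }
    return [
        "ghz",
        "--insecure",                    # plaintext connection, no TLS
        "--proto", proto_path,
        "--call", "gitaly.CommitService.FindCommit",
        "--data", json.dumps(request),
        "--total", str(total),           # total number of requests
        "--concurrency", str(concurrency),
        "--format", "json",              # machine-readable report
        address,
    ]


if __name__ == "__main__":
    # Hypothetical host, proto path, and repository, for illustration only.
    cmd = build_ghz_command("gitaly.example:8075", "proto/commit.proto",
                            "default", "git.git", "HEAD")
    print(" ".join(cmd))
```

Building the command as a list rather than a shell string avoids quoting problems with the JSON payload when it is passed to `subprocess.run`.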

By default, the Gitaly host will be profiled using perf and several libbpf-tools binaries. Testing shows that profiling slows average response time by ~10%. Profiling can be disabled by setting the profile variable to false.

In this initial MVC only three RPCs are supported: FindCommit, GetBlobs, and ListCommitsByOid. These are among the most commonly used RPCs in real traffic and are simple queries.

Five open source repositories are used:

git/git: a relatively small and well-maintained repository
gitlab-org/gitlab: very large number of references and a pool repo
torvalds/linux: larger repository
homebrew/homebrew-core: very large number of tree objects
google/chromium: very large, large objects in repo, many refs

Note that currently we are not setting gitaly-session-id on requests, so the catfile cache is not being used and results will not match production environments.
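
For reference, gRPC request metadata can be passed to ghz through its `--metadata` option as a JSON string, so a session ID could in principle be attached as sketched below. This is only an assumption about how a follow-up might look; the merge request does not do this, and the helper name and the use of a random UUID as the session value are hypothetical.

```python
import json
import uuid


def session_metadata_args(session_id=None):
    """Return ghz arguments that attach a gitaly-session-id header.

    If no session ID is given, a random UUID is generated (an assumed,
    illustrative choice of value).
    """
    if session_id is None:
        session_id = str(uuid.uuid4())
    return ["--metadata", json.dumps({"gitaly-session-id": session_id})]


if __name__ == "__main__":
    print(session_metadata_args("benchmark-session-1"))
```

The returned list would be appended to the rest of the ghz arguments before the target address.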
