Praefect with distributed reads causes a substantial performance hit on Postgres on Reference Architectures

Quality Enablement has done another performance test run of Gitaly Cluster with distributed reads on, now that #2944 (closed) has been fixed.

The tests found that on our recommended Reference Architectures, specifically the 10k and 50k, Praefect still causes a significant performance hit on its Postgres database.

In a nutshell, it appears Praefect still requires a heavily specced Postgres node, around 2 to 4 times bigger CPU-wise compared to what we currently recommend. Indeed, it would appear to require a bigger database than GitLab itself requires.

The test params were as follows:

  • Tests were run against the 10k and 50k environments on version 13.3.0-pre 82d6547b2a5 nightly
  • Praefect is using the same Postgres database as GitLab
  • Distributed reads are on

To show the performance clearly, the following are sets of results from the 50k environment at different specs:

50k - Postgres n1-standard-16 (current recommendation), Praefect n1-highcpu-2

NAME                                      | RPS    | RPS RESULT           | TTFB AVG  | TTFB P90           | REQ STATUS    | RESULT
------------------------------------------|--------|----------------------|-----------|--------------------|---------------|-----------------
api_v4_projects_repository_files_file_raw | 1000/s | 269.95/s (>800.00/s) | 3094.94ms | 5786.45ms (<500ms) | 99.52% (>95%) | FAILED²

50k - Postgres n1-standard-32, Praefect n1-highcpu-4

NAME                                      | RPS    | RPS RESULT           | TTFB AVG  | TTFB P90           | REQ STATUS    | RESULT
------------------------------------------|--------|----------------------|-----------|--------------------|---------------|-----------------
api_v4_projects_repository_files_file_raw | 1000/s | 513.17/s (>800.00/s) | 1719.65ms | 3677.99ms (<500ms) | 99.96% (>95%) | FAILED²

50k - Postgres n1-standard-64, Praefect n1-highcpu-8

NAME                                      | RPS    | RPS RESULT           | TTFB AVG | TTFB P90          | REQ STATUS     | RESULT
------------------------------------------|--------|----------------------|----------|-------------------|----------------|-----------------
api_v4_projects_repository_files_file_raw | 1000/s | 853.52/s (>800.00/s) | 615.25ms | 998.15ms (<500ms) | 100.00% (>95%) | FAILED²

Full test results can be seen here - gitlab-org/quality/performance#252 (comment 396930902)
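
As a rough way to compare the three 50k runs side by side, the sketch below takes the figures from the result tables and computes how throughput scales relative to the current recommendation. The vCPU counts are read off the machine type names, the thresholds are the ones shown in the tables, and the `summarise` helper and its field names are illustrative only, not part of any GitLab tooling.

```python
# Figures copied from the three 50k result tables in this issue.
# Each entry: (Postgres vCPUs, Praefect vCPUs, achieved RPS, TTFB P90 in ms)
runs = [
    (16, 2, 269.95, 5786.45),  # n1-standard-16 / n1-highcpu-2
    (32, 4, 513.17, 3677.99),  # n1-standard-32 / n1-highcpu-4
    (64, 8, 853.52, 998.15),   # n1-standard-64 / n1-highcpu-8
]

# Pass thresholds as shown in the tables.
RPS_THRESHOLD = 800.0
TTFB_P90_THRESHOLD_MS = 500.0


def summarise(runs):
    """Return one summary dict per run, relative to the first (baseline) run."""
    baseline_rps = runs[0][2]
    rows = []
    for pg_vcpus, praefect_vcpus, rps, ttfb_p90 in runs:
        rows.append({
            "postgres_vcpus": pg_vcpus,
            "praefect_vcpus": praefect_vcpus,
            # Throughput as a multiple of the current recommendation.
            "rps_vs_baseline": round(rps / baseline_rps, 2),
            "rps_ok": rps > RPS_THRESHOLD,
            "ttfb_ok": ttfb_p90 < TTFB_P90_THRESHOLD_MS,
        })
    return rows


if __name__ == "__main__":
    for row in summarise(runs):
        print(row)
```

On these numbers, throughput grows by roughly 1.9x and 3.2x as the Postgres node doubles and quadruples in size. Only the largest spec clears the RPS threshold, and none of the three meets the TTFB P90 threshold.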

Edited by Grant Young