
Introduction of in-memory cache for reads distribution

Pavlo Strokov requested to merge ps-up-to-date-storages-cache into master

With the distributed_reads feature enabled, each read operation executes a database query to get the state of the storages for a particular repository. More read calls lead to more database access operations, so the load on the database grows linearly (or worse). To mitigate this problem, an in-memory cache is introduced in front of the database. Entries expire after 1 minute by default, and invalidation happens on receiving events from the database. The events are sent by triggers attached to the repositories table (delete operations) and the storage_repositories table (insert, delete, and update operations).

To monitor the cache, a new counter was added: gitaly_praefect_uptodate_storages_cache_access_total. It tracks the number of cache hits, misses, populates, and evicts per virtual storage.
To check whether the cache was disabled, search the logs for the message: received payload can't be processed, cache disabled

Closes: #3053 (closed)

An example of the new set of metrics:

  • Distribution of read operations across different nodes:
    gitaly_feature_flag_checks_total{enabled="true",flag="distributed_reads"} 157
    gitaly_praefect_read_distribution{storage="praefect-internal-0",virtual_storage="default"} 50
    gitaly_praefect_read_distribution{storage="praefect-internal-1",virtual_storage="default"} 51
    gitaly_praefect_read_distribution{storage="praefect-internal-2",virtual_storage="default"} 56
  • Usage of the in-memory cache of the up-to-date storages:
    gitaly_praefect_uptodate_storages_cache_access_total{type="evict",virtual_storage="default"} 9
    gitaly_praefect_uptodate_storages_cache_access_total{type="hit",virtual_storage="default"} 80
    gitaly_praefect_uptodate_storages_cache_access_total{type="miss",virtual_storage="default"} 10
    gitaly_praefect_uptodate_storages_cache_access_total{type="populate",virtual_storage="default"} 10
  • After disabling the cache by sending an invalid notification:
    gitaly_praefect_uptodate_storages_cache_access_total{type="evict",virtual_storage="default"} 9
    gitaly_praefect_uptodate_storages_cache_access_total{type="hit",virtual_storage="default"} 80
    gitaly_praefect_uptodate_storages_cache_access_total{type="miss",virtual_storage="default"} 34
    gitaly_praefect_uptodate_storages_cache_access_total{type="populate",virtual_storage="default"} 10
  • Cache enabled again after a successful notification is received:
    gitaly_praefect_uptodate_storages_cache_access_total{type="evict",virtual_storage="default"} 16
    gitaly_praefect_uptodate_storages_cache_access_total{type="hit",virtual_storage="default"} 92
    gitaly_praefect_uptodate_storages_cache_access_total{type="miss",virtual_storage="default"} 57
    gitaly_praefect_uptodate_storages_cache_access_total{type="populate",virtual_storage="default"} 12
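To read these counters, it can help to turn them into a hit ratio, hits / (hits + misses). The sketch below does that for the first snapshot and for the differences between consecutive snapshots; `hitRatio` is a hypothetical helper, and treating the differences as clean "disabled" and "re-enabled" periods is an assumption, since the snapshots were taken by hand.

```go
package main

import "fmt"

// hitRatio returns hits / (hits + misses), or 0 when there were no lookups.
func hitRatio(hits, misses float64) float64 {
	if hits+misses == 0 {
		return 0
	}
	return hits / (hits + misses)
}

func main() {
	// Counter values are copied from the metric samples in this description.
	// First snapshot: 80 hits, 10 misses.
	fmt.Printf("cache enabled:      %.1f%%\n", 100*hitRatio(80, 10))
	// Second minus first snapshot: hits did not change, misses grew by 24.
	fmt.Printf("cache disabled:     %.1f%%\n", 100*hitRatio(80-80, 34-10))
	// Third minus second snapshot: 12 more hits, 23 more misses.
	fmt.Printf("after re-enabling:  %.1f%%\n", 100*hitRatio(92-80, 57-34))
}
```

While the cache was disabled every access was counted as a miss and the hit, populate, and evict counters did not move, which matches the second snapshot.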

I have also done a small performance test on my laptop using gitaly-bench.
The setup is 1 Praefect node with 3 Gitaly nodes behind it. The results are:

  • without reads distribution
    ./gitaly-bench \
     --host unix:///Users/pstrokov/Workspace/gitlab-development-kit/praefect.socket \
     -repo @hashed/19/58/19581e27de7ced00ff1ce50b2047e7a567c76b1cbaebabe5ef03f7c3017bb5b7.git \
     -concurrency 10 \
     -iterations 1000 \
     find-commit \
     -revision a97f4d2be228e8a7d1f714447fdba2b81d0b3ac5
    Stats:
     Total requests: 10000
     Elapsed Time (sec): 96.2363
     Average QPS: 103.91
  • with reads distribution
    ./gitaly-bench \
     --host unix:///Users/pstrokov/Workspace/gitlab-development-kit/praefect.socket \
     -repo @hashed/19/58/19581e27de7ced00ff1ce50b2047e7a567c76b1cbaebabe5ef03f7c3017bb5b7.git \
     -concurrency 10 \
     -iterations 1000 \
     -features distributed_reads \
     find-commit \
     -revision a97f4d2be228e8a7d1f714447fdba2b81d0b3ac5
    Stats:
     Total requests: 10000
     Elapsed Time (sec): 56.6595
     Average QPS: 176.49
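As a quick check of these numbers, the average QPS is the total number of requests divided by the elapsed time, and the ratio of the two QPS values gives the speedup. The sketch below recomputes both from the figures reported by gitaly-bench; `qps` is a hypothetical helper, and the result describes only this single laptop run.

```go
package main

import "fmt"

// qps returns the average number of requests per second.
func qps(requests int, elapsedSec float64) float64 {
	return float64(requests) / elapsedSec
}

func main() {
	// Totals and elapsed times are copied from the gitaly-bench output above.
	without := qps(10000, 96.2363)
	with := qps(10000, 56.6595)

	fmt.Printf("without reads distribution: %.2f QPS\n", without)
	fmt.Printf("with reads distribution:    %.2f QPS\n", with)
	fmt.Printf("speedup:                    %.2fx\n", with/without)
}
```

With these inputs the speedup comes out at roughly 1.7x for find-commit at concurrency 10.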
Edited by Pavlo Strokov
