Analyze uncached PackObjectsHook traffic
Once gitlab-org/gitaly!3084 (merged) is in, we should be able to enable PackObjectsHook selectively, temporarily, or permanently, and gather log data and metrics to see how much data is read and written and how much of that data is unique.
Another aspect we can already look at is the distribution of cache keys over time. What would be good retention times? We can weight this by the amount of data stored per key.
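To make the retention question concrete, here is a minimal sketch of how the byte-weighted hit ratio could be estimated from log data for a candidate retention period. The event shape `(timestamp, cache key, bytes)` and the function name `hit_bytes_ratio` are hypothetical, not something Gitaly emits in this form. It also assumes an entry expires a fixed time after it was created, not after it was last read.

```python
from typing import Iterable, Tuple

# Hypothetical log record: (unix timestamp, cache key, response size in bytes).
Event = Tuple[float, str, int]


def hit_bytes_ratio(events: Iterable[Event], ttl_seconds: float) -> float:
    """Fraction of served bytes that would have come from the cache.

    Assumption: an entry is created on a miss and expires ttl_seconds
    after creation, regardless of how often it is read.
    """
    created_at = {}  # cache key -> timestamp of the miss that created it
    hit_bytes = 0
    total_bytes = 0
    for ts, key, nbytes in sorted(events):
        total_bytes += nbytes
        born = created_at.get(key)
        if born is not None and ts - born < ttl_seconds:
            hit_bytes += nbytes  # entry still live: served from cache
        else:
            created_at[key] = ts  # miss: (re)create the entry
    return hit_bytes / total_bytes if total_bytes else 0.0


# Made-up sample: the same key requested four times.
sample = [(0, "a", 100), (60, "a", 100), (200, "a", 100), (400, "a", 100)]
ratios = {ttl: hit_bytes_ratio(sample, ttl) for ttl in (120, 300, 600)}
```

Running this over a day of real events for several candidate values (two, five, ten minutes) would give the kind of comparison discussed under Questions.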
Questions
- Is 5 minutes a reasonable retention period?
  - Yes. Going down to two minutes reduces the cache hit bytes ratio by 13% on one of the servers: https://docs.google.com/spreadsheets/d/169uioqo8PFPPq5fX9iBNOk6YZpD5ONAjydqJH4eb2qs/edit#gid=1061164467 Going up to ten minutes does produce a jump in the ratio, but the question here is only whether 5 minutes is 'reasonable'. Put another way: there's no strong reason to suggest a different value.
- Come up with a list of projects, besides gitlab-org/gitlab, that would put some pressure on this cache. We have to choose projects, not Gitaly servers, because of the way that feature toggling works.
  - gitlab-com/www-gitlab-com is a big one. I would also like to consider fluidattacks/product, which is on nfs-file48. Unfortunately, the other servers with large maximum cache sizes are not dominated by a single repo in the same way, so we might want to pick the top N projects from those servers.
  - There is an interesting project on nfs-file08 which is 25 GB in size and fetched infrequently (3 hits and 13 misses in a day).
- How large will the caches be?
  - Around 30 GB per server is the maximum cache size we observed, which is comfortably under the remaining free space on all Gitaly servers.
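A maximum cache size like the one above can be estimated by replaying the same kind of log data. The following is a rough sketch, not how the figure in this issue was actually computed. The event shape and the name `peak_cache_bytes` are hypothetical, and it assumes entries are created on a miss and removed a fixed time after creation.

```python
import heapq
from typing import Iterable, Tuple

# Hypothetical log record: (unix timestamp, cache key, response size in bytes).
Event = Tuple[float, str, int]


def peak_cache_bytes(events: Iterable[Event], ttl_seconds: float) -> int:
    """Largest total size of live cache entries seen during the replay.

    Assumption: an entry is created on a miss and removed ttl_seconds
    after creation.
    """
    live = {}      # cache key -> size of the live entry
    expiries = []  # min-heap of (expiry timestamp, cache key)
    current = 0
    peak = 0
    for ts, key, nbytes in sorted(events):
        # Drop every entry whose retention period has elapsed.
        while expiries and expiries[0][0] <= ts:
            _, old_key = heapq.heappop(expiries)
            current -= live.pop(old_key)
        if key not in live:
            # Miss: the response is written to the cache.
            live[key] = nbytes
            heapq.heappush(expiries, (ts + ttl_seconds, key))
            current += nbytes
            peak = max(peak, current)
    return peak


# Made-up sample with three keys.
sample = [(0, "a", 10), (10, "b", 20), (100, "a", 10), (400, "c", 5)]
peak_5min = peak_cache_bytes(sample, 300)
```

Comparing the result per server against the free disk space gives the headroom check described in the last answer.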
Edited by Sean McGivern