Pack Objects Limit

John Cai requested to merge jc-pack-objects-limit into master

Pack Objects Limiter

The operation that most frequently saturates Gitaly nodes is git-pack-objects, which is triggered by git-fetch, git-clone, or git-pull. Each of these starts a pack-objects process on the server through the pack objects hook. Because pack-objects can be very expensive, we added a pack objects cache to help serve clones when many identical clones are made in quick succession.

If enough cache misses happen, however, we can still end up with saturation. If we limit pack-objects processes after the cache miss, we can effectively push back on whoever is doing massive amounts of clones.

This can be even more effective than the concurrency limiter because that limiter works per RPC, and a PostUploadPackWithSidechannel or SSHUploadPackWithSidechannel call could get a cache hit in the pack objects cache. In that case we wouldn't want to limit the request, since serving it costs the system very little.

This change adds a new config section, [pack_objects_limiting], that defines a limit per key to guard git-pack-objects. The supported keys are "user" and "repo".
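As a rough illustration of limiting per key, the sketch below derives a key from either the user or the repository and caps the number of in-flight pack-objects processes for each key. Everything here is an assumption made for illustration: the `request` fields, `limitKey`, `keyedLimiter`, and the reject-instead-of-queue behavior are hypothetical and not taken from the merge request's code.

```go
package main

import (
	"fmt"
	"sync"
)

// request carries the hypothetical fields a limiter key could be derived from.
type request struct {
	UserID   string
	RepoPath string
}

// limitKey builds the limiter key for a request. keyType is "user" or "repo",
// the two key types named in the description.
func limitKey(keyType string, r request) (string, error) {
	switch keyType {
	case "user":
		return "user:" + r.UserID, nil
	case "repo":
		return "repo:" + r.RepoPath, nil
	default:
		return "", fmt.Errorf("unsupported key type %q", keyType)
	}
}

// keyedLimiter caps the number of concurrent operations per key.
type keyedLimiter struct {
	mu       sync.Mutex
	limit    int
	inFlight map[string]int
}

func newKeyedLimiter(limit int) *keyedLimiter {
	return &keyedLimiter{limit: limit, inFlight: map[string]int{}}
}

// tryAcquire reserves a slot for key, or returns false if the key is at its limit.
func (l *keyedLimiter) tryAcquire(key string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inFlight[key] >= l.limit {
		return false
	}
	l.inFlight[key]++
	return true
}

// release frees a slot previously reserved for key.
func (l *keyedLimiter) release(key string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inFlight[key] > 0 {
		l.inFlight[key]--
	}
	if l.inFlight[key] == 0 {
		delete(l.inFlight, key) // keep the map from growing without bound
	}
}

func main() {
	limiter := newKeyedLimiter(2)
	alice := request{UserID: "alice", RepoPath: "group/project.git"}
	bob := request{UserID: "bob", RepoPath: "group/project.git"}

	aliceKey, _ := limitKey("user", alice)
	bobKey, _ := limitKey("user", bob)

	fmt.Println(limiter.tryAcquire(aliceKey)) // first slot for alice
	fmt.Println(limiter.tryAcquire(aliceKey)) // second slot for alice
	fmt.Println(limiter.tryAcquire(aliceKey)) // alice is at the limit
	fmt.Println(limiter.tryAcquire(bobKey))   // bob has his own budget

	limiter.release(aliceKey)
	fmt.Println(limiter.tryAcquire(aliceKey)) // a slot is free again
}
```

Keying by user pushes back on a single heavy client across repositories, while keying by repo protects the node from many clients hitting one expensive repository; in the example both requests would share a key if `"repo"` were used.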

This change hardcodes a value for the limit and includes two feature flags: one to limit by user and the other to limit by repo. This way, we avoid the need to change omnibus-gitlab and chef-repo just to experiment in production and see how this works.

fixes: #4401 (closed)

Edited by John Cai
