Caching for git cat-file
From what I understand, we see cases on gitlab.com where a /raw/ blob URL, which serves the unmodified contents of a file in a repository, becomes very popular. This then puts a strain on the system. (cc @pcarranza)
Because we actively pack Git repositories (with `git repack`, `git gc`, etc.), such a blob would have to be decompressed from disk on each request. Perhaps it would be good to cache the unpacked blob on disk in some form. This would be similar to what we do with requests that run `git archive`: there too we use an on-disk cache for the responses.
One thing I am not sure about yet is whether we should cache inside Git (by creating a loose object file) or if we should 'own' the cache files ourselves. Either way I like the idea that `git gc` cleans up the cache files.
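
To make the 'own the cache files ourselves' option more concrete, here is a rough sketch in Python. It is only an illustration under assumptions: the function names, the `cache_dir` layout (two-character fan-out by object ID, borrowed from how Git lays out loose objects) and the injectable `fetch` parameter are all hypothetical, and nothing here is existing GitLab code. The only real Git call is `git cat-file blob <oid>`. Note that a cache like this is not cleaned up by `git gc`, so expiry would need to be handled separately.

```python
import os
import subprocess
import tempfile

HEX_DIGITS = set("0123456789abcdef")


def read_blob_from_git(repo_path, oid):
    """Ask Git for the blob contents; for a packed object this is the
    step that has to unpack it on every call."""
    result = subprocess.run(
        ["git", "-C", repo_path, "cat-file", "blob", oid],
        check=True,
        capture_output=True,
    )
    return result.stdout


def cached_blob(cache_dir, repo_path, oid, fetch=read_blob_from_git):
    """Return the blob contents, serving from cache_dir when possible.

    `fetch` is a hypothetical hook so the Git call can be swapped out.
    """
    # Only accept full SHA-1 / SHA-256 hex IDs, so the ID is safe to
    # use as a file name and cannot escape cache_dir.
    if len(oid) not in (40, 64) or not set(oid) <= HEX_DIGITS:
        raise ValueError("expected a full lowercase hex object ID")

    path = os.path.join(cache_dir, oid[:2], oid[2:])
    try:
        with open(path, "rb") as cached:
            return cached.read()  # cache hit: Git is not invoked
    except FileNotFoundError:
        pass

    data = fetch(repo_path, oid)  # cache miss: let Git unpack the blob

    # Write to a temporary file and rename it into place, so concurrent
    # readers never see a partially written cache entry.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.replace(tmp_path, path)
    return data
```

Because a blob's object ID is derived from its contents, an entry keyed by ID never goes stale; the open questions are only how large the cache may grow and who deletes old entries.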
cc @chriscool