This is problematic for Gitaly HA because there will be directories on multiple Gitaly servers, and we don't want to expose those servers directly.
We should take a look at the things that use ListDirectories, and reimplement whatever it is they do inside Gitaly. In some cases it might be automated cleanup; in other cases we might need a new RPC. In the latter case that RPC should express what we want to happen, and leave the actual work to Gitaly.
The crux of the issue is that the cleanup RPC removes leftover namespaces or repositories that are not on hashed storage. In other words, if I've removed a repository called zj-gitlab/foo from GitLab and we then find zj-gitlab/foo.git on disk, this rake task removes it. I think it would suffice to remove this rake task.
The other rake task that uses this RPC removes namespace leftovers on disk; same story. It would never scale, but it was ported anyway.
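For readers unfamiliar with the rake task, the repository check described above boils down to roughly the following. This is only a sketch and not the actual implementation: the function name `orphanedLegacyRepos`, the in-memory inputs, and the assumption that hashed-storage paths start with `@hashed/` are all illustrative.

```go
package main

import (
	"sort"
	"strings"
)

// orphanedLegacyRepos returns the on-disk repository paths that have no
// matching entry in known. Hypothetical helper, not part of GitLab or Gitaly.
//
// onDisk holds paths relative to the storage root, e.g. "zj-gitlab/foo.git".
// known holds the full paths GitLab still tracks, e.g. "zj-gitlab/foo".
func orphanedLegacyRepos(onDisk []string, known map[string]bool) []string {
	var orphans []string
	for _, p := range onDisk {
		// Assumption: hashed-storage repositories live under "@hashed/"
		// and are never candidates for this legacy cleanup.
		if strings.HasPrefix(p, "@hashed/") {
			continue
		}
		if !known[strings.TrimSuffix(p, ".git")] {
			orphans = append(orphans, p)
		}
	}
	sort.Strings(orphans)
	return orphans
}
```

Note that this needs the full directory listing on the client side, which is exactly what ListDirectories provides and why the approach does not scale.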
Concrete proposal:
Remove the gitlab:cleanup:dirs rake task and documentation
Remove the gitlab:cleanup:repos rake task and documentation
Remove the StorageService::ListDirectories RPC from Rails
Remove the StorageService.ListDirectories() from Gitaly+Proto
@jramsay I think it's OK to remove the rake tasks, even before 13.0, what is your opinion?
Sure, though we should have a way to clean a Gitaly storage post HashedStorage migration. Would you agree? In general, if there's a +moved, +orphaned, or +deleted directory somewhere and it has been stale for 30 days, should it be removed? I'm not sure about touching data like this, though; I'd rather ask an admin to clean it up manually. I'm not sure what your opinion is on that, @jramsay.