How does GitLab behave if a pool repository that should exist in Gitaly cannot be found?
This is one of the edge cases around Git object deduplication. We track pool repositories in the GitLab SQL database, but the SQL record for a pool can be created without the repository ever being created on disk in Gitaly.
- How will GitLab behave in this scenario? Will the user notice? Can there be data loss?
- How will GitLab recover from this scenario?
A particular example of this is Geo, where we currently have this logic:
    def create_object_pool_on_secondary?
      return unless ::Gitlab::Geo.secondary?
      return unless project.object_pool_missing?
      return unless pool_repository.source_project_repository.exists?

      true
    end
I wonder if it would be better to initialize an empty pool if there is no source project, so that at least there is a pool. Otherwise we are stuck with this inconsistency between SQL ("there is a pool repository") and Gitaly ("no, there is no pool repository").
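To make the difference concrete, here is a minimal sketch of the two decision paths as plain Ruby functions. The names `current_decision` and `proposed_decision`, their keyword arguments, and the returned symbols are all hypothetical and are not part of the GitLab codebase. The sketch only models the guard logic above. It does not call Gitaly, and it assumes that creating an empty pool is something Gitaly would accept, which would still need to be confirmed.

```ruby
# Hypothetical model of the guard logic; these are not real GitLab methods.

# Mirrors the existing check: the pool is only created when a source
# project repository exists, otherwise nothing happens.
def current_decision(secondary:, pool_missing:, source_exists:)
  return :skip unless secondary
  return :skip unless pool_missing
  return :skip unless source_exists

  :create_from_source
end

# Proposed variant (an assumption, not an implemented behavior): when the
# source repository is gone, fall back to an empty pool so that SQL and
# Gitaly agree that a pool exists.
def proposed_decision(secondary:, pool_missing:, source_exists:)
  return :skip unless secondary
  return :skip unless pool_missing

  source_exists ? :create_from_source : :create_empty
end

# The problematic case: SQL says there is a pool, Gitaly has none, and the
# source project repository does not exist.
orphaned = { secondary: true, pool_missing: true, source_exists: false }

current_decision(**orphaned)  # => :skip (the inconsistency persists)
proposed_decision(**orphaned) # => :create_empty
```

With the current logic the orphaned case is skipped on every run, so the mismatch never heals on its own. The proposed variant trades that for a pool that contains no objects until something repopulates it.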
Edited by Jacob Vosmaer