Geo selective sync changes don't clean up replicated files associated with a project
The following discussion from !3595 (merged) should be addressed:
We don't clean up files associated with the project here :( that's a bug - we should have a follow-up issue for it.
A Geo secondary that uses selective sync only syncs a subset of all groups. The repository and any associated files for projects in those groups are synchronized, and projects outside of those groups are left untouched.
When the selective sync list is changed, the primary sends an event to the secondary to notify it. In response, the secondary removes the repository of any projects no longer in the list. However, it doesn't clean up associated files.
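To make the scope change concrete, here is a minimal plain-Ruby sketch of working out which projects leave the scope when the list changes. The `group_projects` mapping and both method names are hypothetical, not existing Geo code, and the sketch ignores subgroups. These are the projects whose repository is removed today and whose associated files would also need removing.

```ruby
require 'set'

# Hypothetical input: group_projects maps a group id to the ids of the
# projects directly inside that group (subgroups are ignored here).
def projects_in_scope(group_projects, selected_group_ids)
  selected_group_ids.flat_map { |group_id| group_projects.fetch(group_id, []) }.to_set
end

# Projects that were synced under the old list but are not under the new one.
def projects_leaving_scope(group_projects, old_group_ids, new_group_ids)
  projects_in_scope(group_projects, old_group_ids) -
    projects_in_scope(group_projects, new_group_ids)
end
```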
Not all files are associated with a project, so this is a fairly tricky thing to solve. A suggestion I made some time ago was to add an optional `project_id` to the `Geo::FileRegistry` model, which could be filled in at synchronization time. Doing this would simplify the implementation of this issue significantly, and would also allow us to simplify some other queries related to file synchronization.
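As a rough illustration of why an optional project reference helps, the sketch below uses an in-memory stand-in for registry rows. `FileRegistryEntry` and `entries_to_clean_up` are hypothetical names, and the real model would be an ActiveRecord class rather than a `Struct`. With `project_id` recorded at sync time, finding files to clean up becomes a simple filter, and files with no project are left alone.

```ruby
require 'set'

# Hypothetical stand-in for a Geo::FileRegistry row. project_id is optional:
# nil means the file is not associated with any single project.
FileRegistryEntry = Struct.new(:file_type, :file_id, :project_id, keyword_init: true)

# Returns the entries whose project is known and is no longer being synced.
# Entries without a project_id are skipped, since they cannot be attributed.
def entries_to_clean_up(entries, synced_project_ids)
  in_scope = synced_project_ids.to_set
  entries.select { |entry| entry.project_id && !in_scope.include?(entry.project_id) }
end
```

For example, given entries for projects 10 and 20 plus one entry with no project, calling `entries_to_clean_up(entries, [10])` selects only the entry belonging to project 20.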
I also don't know how LFS objects behave in this situation at the moment. If an LFS object is referenced by two projects, but only one of them is in the selective sync scope, then we should keep the object. It should only be removed if none of the projects are in the selective sync scope. I suspect that in actuality, the selective sync list is ignored here.
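The retention rule I'd expect for LFS objects can be sketched as follows. This is an assumption about the desired behaviour, not a description of what the code does today. The `lfs_links` mapping and `removable_lfs_object_ids` are hypothetical.

```ruby
require 'set'

# Hypothetical input: lfs_links maps an LFS object id to the ids of all
# projects that reference it.
# An object is removable only if none of its projects are still in scope.
def removable_lfs_object_ids(lfs_links, synced_project_ids)
  in_scope = synced_project_ids.to_set
  lfs_links
    .reject { |_oid, project_ids| project_ids.any? { |id| in_scope.include?(id) } }
    .keys
end
```

An object shared between an in-scope and an out-of-scope project is kept. Under this sketch, an object with no referencing projects at all also counts as removable, which may or may not be what we want.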