Support cleaning up orphaned LFS files in design repositories
About
At time of writing there is a rake task that can be run by admins that will "clean up" orphaned LFS Objects. This is scheduled to be run automatically in a near-milestone #223621 (closed).
When run, it allows a project to reduce their recorded LFS usage quota to take into account any files that have been deleted. It also sets up RemoveUnreferencedLfsObjectsWorker to later actually delete the files.
Problems
The task only runs on the main project repository
Currently, OrphanLfsFileReferences, the class that handles the task, will only clean up orphaned LFS Files within the main project repository, and is unaware of other repositories, like the design repository, which also stores files with Git LFS.
The task contains a bug that might break designs
In the situation where a file has been added to an LFS-enabled project repository and also to the same project's design repository, and then deleted from the project repository, OrphanLfsFileReferences will currently cause the project to lose its ability to lookup the design which will render the design file as permanently broken.
This happens due to the scoping of the LfsObjectsProject records to be deleted, which currently does not pay attention to selecting the LfsObjectsProject by repository_type.
In the above scenario, the class will also destroy the LfsObjectsProject where repository_type is :design in addition to the one where repository_type is :project.
Proposal
Make OrphanLfsFileReferences work for all repositories that LfsObjectsProject knows of.
Pay attention when scoping which LfsObjectsProject records to destroy, and scope by the relevant repository_type.
IMPORTANT NOTE: At time of writing repository_type is nullable. A null repository_type value is considered to mean :project. The scoping will need to take into account this.
Backlog Refinement by @alexkalderimis:
Documentation needed:
None needed in addition to what we currently have.
Test Activity Planned:
Unit tests for orphan_lfs_file_references.rb, parameterized by repository_type.
E2E Scenario tests that verify that this task never makes design files unavailable.
Security Tests Planned:
None
Explanation for Approach to be taken:
The proposed method in the description sounds excellent.
MR Breakdown:
- Add scopes for the different
repository_types, paying attention to the meaning ofnull. - Use these scope in
OrphanLfsFileReferences
Weight Estimate: 2-3
Backlog Refinement by @.luke:
Documentation needed:
None. The existing docs should be enough.
Test Activity Planned:
Unit tests of OrphanLfsFileReferences.
Explanation for Approach to be taken:
Follow the Proposal in this issue's description.
We may want to change OrphanLfsFileReferences to iterate over each repository type and effectively call remove_orphan_references three times, passing in repository details. We may want to think about adding some small classes to help.
MR Breakdown:
1 MR
Weight Estimate:
2
Backlog Refinement by @toupeira:
Documentation needed: None
Test Activity Planned: Add tests around the new behaviour, and the current bug.
Security Tests Planned: None
Explanation for Approach to be taken: The proposal sounds solid!
MR Breakdown: 1 MR
Weight Estimate: 2-3 (1 extra point for unfamiliarity with LFS)