Use repository recent objects size for stats
What does this MR do and why?
Use repository recent objects size for project statistics.
Project statistics for repository storage size can be misleading, as we currently use the full disk size of the repository when calling project.repository.size.
That means the size often includes objects that are out of reach for a user, and users have to wait for housekeeping tasks to complete before accurate sizes are reported.
As part of #402988 (closed), a new RPC, RepositoryInfo, was made available. It gives a more granular breakdown of a repository's size and, in particular, exposes the recent objects size, which is what we should be using for the project statistics repository size. Confirmation here: #418243 (comment 1490026720)
To start using the new size, this MR:
- adds a method for returning the recent objects size to the raw repository (Gitlab::Git::Repository) and the model (Repository)
- adds a feature flag, recent_objects_for_project_statistics
- conditionally uses the new size for project_statistics#repository_size, depending on the status of the new feature flag
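The flag-gated selection described above can be sketched in plain Ruby. This is a minimal illustration and not the code in this MR: FakeFeature, FakeRepository and repository_size_for_statistics are hypothetical stand-ins, the recent_objects_size method name is an assumption, and both sizes are treated as plain byte counts for simplicity.

```ruby
# Hypothetical stand-in for the feature flag API; not the real Feature class.
module FakeFeature
  @enabled = {}

  def self.enable(name)
    @enabled[name] = true
  end

  def self.enabled?(name)
    @enabled.fetch(name, false)
  end
end

# Hypothetical stand-in for a repository that exposes both sizes, in bytes.
# `recent_objects_size` is an assumed method name.
FakeRepository = Struct.new(:size, :recent_objects_size)

# Returns the recent objects size when the flag is on, otherwise the
# full on-disk size (the old behaviour).
def repository_size_for_statistics(repository)
  if FakeFeature.enabled?(:recent_objects_for_project_statistics)
    repository.recent_objects_size
  else
    repository.size
  end
end

repo = FakeRepository.new(2_097_152, 2_873)

size_with_flag_off = repository_size_for_statistics(repo)
FakeFeature.enable(:recent_objects_for_project_statistics)
size_with_flag_on = repository_size_for_statistics(repo)

puts "flag off: #{size_with_flag_off} bytes"
puts "flag on:  #{size_with_flag_on} bytes"
```

Keeping the old size behind the else branch means the flag can be turned off again without any data migration; the next statistics refresh simply stores the full disk size.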
Refs #419903 (closed)
How to set up and validate locally
Basic statistics refresh:
- In a rails console, enable the feature flag:
Feature.enable(:recent_objects_for_project_statistics)
- Trigger a refresh of a project statistics record:
statistics = ProjectStatistics.last
statistics.refresh!
Recent objects (new) vs total disk size (old) test:
- With the feature flag disabled, create a project and upload a file that consumes a reasonable amount of storage (e.g. 2MiB)
You can generate a file on macOS with:
dd if=/dev/urandom bs=2M count=1 of=2_mib_file_name
- Navigate to the usage quotas page for the project (/your-group/your-project/-/usage_quotas#storage-quota-tab)
- Hit Recalculate repository usage
- Refresh the usage quotas page and you should see the 2MiB storage usage
- Confirm repository size stored in project statistics:
project = Project.find(id-of-your-project)
project.statistics.repository_size
# Value should be something like 2097152 (depending on your project/files added previously)
- Clone the project and delete the file you previously uploaded
git clone ssh://git@gdk.test:2222/your-group/your-project.git && cd your-project
rm 2_mib_file_name
git add .; git commit -m 'remove the file'; git push
- Rewrite the repo history to remove the file entirely, as described in these docs: https://docs.gitlab.com/ee/user/project/repository/reducing_the_repo_size_using_git.html#purge-files-from-repository-history (warning: this is a lengthy process - you do not need to wait the 30 mins as described in step 14)
- Check the statistics repository size, either in Usage Quotas as above or in a rails console; it will still report 2MiB
statistics = project.statistics
statistics.refresh!
statistics.repository_size
=> 2097152
- Enable the feature flag (Feature.enable(:recent_objects_for_project_statistics))
- Refresh the statistics and check again - it should now be lower, without the 2MiB file 🎉
statistics.refresh!
statistics.repository_size
=> 2873
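The console reports repository_size in bytes, while the usage quotas page shows a rounded, human-readable figure. To compare the two during the test above, a small helper along these lines may be useful. It is a sketch only: human_size is a hypothetical name and is not part of this MR or of GitLab.

```ruby
BYTES_PER_KIB = 1024
BYTES_PER_MIB = 1024 * 1024

# Hypothetical helper: formats a byte count as B, KiB or MiB,
# rounded to two decimal places.
def human_size(bytes)
  if bytes >= BYTES_PER_MIB
    "#{(bytes.to_f / BYTES_PER_MIB).round(2)} MiB"
  elsif bytes >= BYTES_PER_KIB
    "#{(bytes.to_f / BYTES_PER_KIB).round(2)} KiB"
  else
    "#{bytes} B"
  end
end

puts human_size(2_097_152) # size with the flag disabled
puts human_size(2_873)     # size with the flag enabled
```

With the example values from the steps above, the first call corresponds to the 2MiB file still being counted and the second to the size after the history rewrite.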
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.