Use repository recent objects size for project statistics
Background
As part of Start using RepositoryInfo for repository size ... (#402988 - closed), a new Gitaly RPC was exposed to supply a more granular set of repository size information (#418243 (comment 1489947936)).
Gitlab::Git::Repository#size is currently used for project statistics, and returns the complete repository size reported by either RepositorySize or RepositoryInfo depending on a feature flag (#418243 (closed)).
Initially, we thought we didn't need to make any further changes (#402988 (comment 1465486583)) to utilise the new size for usage quotas/billing, but a recent discussion has shown that not to be the case (#418243 (comment 1489947936)).
RepositorySize
RepositorySize is the old RPC. It includes unreachable objects, which take a while to be cleaned up, so customers who clean up storage don't immediately see the result in GitLab, which can be frustrating.
The size returned is in kilobytes.
RepositoryInfo
RepositoryInfo is the new RPC. It returns a more detailed set of information, breaking down the repository size for different contexts.
For project statistics and billing, the recent objects size is what we'd want to use from this new RPC (#418243 (comment 1489947936)).
The size returned from this is in bytes.
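Because the two RPCs report sizes in different units, any code that switches between them needs to normalise the value first. A minimal sketch of that conversion follows; the helper name `size_in_bytes` and the assumption that one kilobyte means 1024 bytes are illustrative and not taken from the GitLab codebase.

```ruby
# Assumption: a kilobyte reported by the old RPC is 1024 bytes.
BYTES_PER_KILOBYTE = 1024

# Hypothetical helper: normalise a size reported by either RPC to bytes.
def size_in_bytes(value, unit:)
  case unit
  when :kilobytes then value * BYTES_PER_KILOBYTE # RepositorySize-style value
  when :bytes then value                          # RepositoryInfo-style value
  else raise ArgumentError, "unknown unit: #{unit.inspect}"
  end
end

old_rpc_size = size_in_bytes(2_048, unit: :kilobytes)
new_rpc_size = size_in_bytes(2_097_152, unit: :bytes)
puts old_rpc_size == new_rpc_size # prints "true"
```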
Proposal
To give our customers a more useful repository size, one that excludes unreachable objects and does not depend on scheduled housekeeping tasks having run, we should:
- expose the recent size, e.g.

      # lib/gitlab/git/repository.rb
      def recent_size
        gitaly_repository_client.repository_info.objects.recent_size
      end

- change project_statistics.update_repository_size to use project.repository.recent_size
- decide if we need to backfill (and perhaps handle that in a separate issue)
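The second step of the proposal could look roughly like the sketch below. It uses stand-in classes so it runs on its own: `SketchRepository`, `SketchProject`, and `SketchProjectStatistics` are hypothetical names, not the real GitLab models, and the sketch assumes the statistic is stored in bytes as returned by `recent_size`.

```ruby
# Stand-in for Gitlab::Git::Repository (hypothetical, not the real class).
class SketchRepository
  def initialize(recent_size_bytes)
    @recent_size_bytes = recent_size_bytes
  end

  # Mirrors the proposed method; returns a canned value in bytes
  # instead of calling Gitaly.
  def recent_size
    @recent_size_bytes
  end
end

# Stand-in for a project that owns a repository.
SketchProject = Struct.new(:repository)

# Stand-in for project statistics.
class SketchProjectStatistics
  attr_reader :repository_size

  def initialize(project)
    @project = project
    @repository_size = 0
  end

  # Reads the recent objects size rather than the complete repository size.
  def update_repository_size
    @repository_size = @project.repository.recent_size
  end
end

project = SketchProject.new(SketchRepository.new(5_242_880))
statistics = SketchProjectStatistics.new(project)
statistics.update_repository_size
puts statistics.repository_size
```

Whether the stored value should remain in bytes or be converted to match what the existing statistic holds is left open here; that depends on how the current column is populated.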