Reduce excessive GC on pull mirrors

Stan Hu requested to merge sh-fix-excessive-project-mirror-gc into master

In gitaly#1686 (closed) and gitlab-com/gl-infra/scalability#20 (closed), we saw a lot of I/O consumed by Git garbage collection running on pull mirrors.

It looks like the problem happened because Projects::AfterImportService always ran after completion of a project import sync, which caused HousekeepingService to run. Since the pushes_since_gc counter never changed, HousekeepingService would always run GC again.

This commit increments the pushes_since_gc counter on each sync so that GC only runs every 200 syncs. Ideally we'd run HousekeepingService only if the repository actually changed, but right now we don't have an easy way of detecting that.
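
The gating idea can be illustrated with a minimal, self-contained sketch. This is not the actual HousekeepingService code: the class name `MirrorHousekeepingSketch`, the `after_sync` hook, and the `gc_runs` counter are hypothetical stand-ins, and only the `pushes_since_gc` counter and the period of 200 come from the description above.

```ruby
# Hypothetical stand-in for the real service. Only pushes_since_gc and the
# period of 200 come from the MR description; everything else is assumed.
class MirrorHousekeepingSketch
  GC_PERIOD = 200

  attr_reader :pushes_since_gc, :gc_runs

  def initialize
    @pushes_since_gc = 0
    @gc_runs = 0
  end

  # Called once after every mirror sync (assumed hook name).
  def after_sync
    # Increment first, so the counter actually advances between syncs.
    @pushes_since_gc += 1
    return unless needed?

    run_gc
  end

  # GC is only due once enough syncs have accumulated.
  def needed?
    @pushes_since_gc >= GC_PERIOD
  end

  private

  # Placeholder for the expensive GC call; here it only records the run
  # and resets the counter.
  def run_gc
    @gc_runs += 1
    @pushes_since_gc = 0
  end
end
```

With this shape, 199 consecutive syncs trigger no GC, the 200th triggers one, and the counter then starts over.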

Note that this problem does NOT happen if the mirror is imported directly via a URL. It only happens if the import_type is one of Gitlab::ImportSources.importer_names (e.g. bitbucket, github, etc.).

Closes gitaly#1686 (closed)

