Check disk space requirements before starting a project import
Summary
When importing large projects (e.g. >300GB), the import can fail due to insufficient disk space. Currently there is no validation of available disk space before starting the import process, and when it does fail due to space constraints, the error messages are not helpful.
Problem
- Large project imports can fail partway through due to running out of disk space, wasting significant time and resources.
- The failure mode is ungraceful — error messages do not clearly indicate that disk space was the root cause.
- This has been observed in multiple support requests (see related RFH below).
Proposal
- Spike: Investigate how to properly estimate disk usage for an import. This may be non-trivial (compressed vs. uncompressed sizes, temporary files, Git object overhead, etc.). Find the best approach for reliable estimation.
- Validate disk space before import: Before starting an import, estimate the required disk space and compare it against available space. If insufficient, fail early with a clear error message.
- Improve error messaging: When an import does fail due to disk space, surface a clear, actionable error message to the user/admin rather than a generic failure.
- Question: Should we impose limits based on the current queue status? APIs and the UI would return a relevant error, such as HTTP 429 for client-side rate limiting or HTTP 503 for server-side load shedding.
- Spike: Following 2026-02-20: Sidekiq jobs on catchall shard exce... (gitlab-com/gl-infra/production#21346 - closed), explore the possibility of booting a dedicated Sidekiq shard to handle a large import, to avoid the `catchall` and `import_shared_storage` shards becoming oversaturated.
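To make the "validate disk space before import" bullet concrete, here is a minimal Ruby sketch of a pre-import check. All names (`check_disk_space!`, `EXPANSION_FACTOR`, etc.) are illustrative, not existing GitLab APIs; the real expansion multiplier is exactly what the estimation spike would determine. Available space is read via the POSIX `df -Pk` utility.

```ruby
# Assumed uncompressed:compressed ratio, including temp files and Git
# object overhead. A placeholder value -- the spike would pick the real one.
EXPANSION_FACTOR = 2.5

class InsufficientDiskSpaceError < StandardError; end

# Available bytes on the filesystem backing `path`, via `df -Pk`.
# Column 4 of the POSIX-format output is "Available" in 1K blocks.
def available_bytes(path)
  `df -Pk #{path}`.lines.last.split[3].to_i * 1024
end

# Rough estimate of the space the import will need on disk.
def estimated_import_bytes(archive_size_bytes)
  (archive_size_bytes * EXPANSION_FACTOR).ceil
end

# Fail fast, before any work is done, with an actionable message.
def check_disk_space!(archive_size_bytes, import_path)
  needed = estimated_import_bytes(archive_size_bytes)
  free   = available_bytes(import_path)
  return if free >= needed

  raise InsufficientDiskSpaceError,
        "Import needs an estimated #{needed / 1024**3} GiB but only " \
        "#{free / 1024**3} GiB is free on #{import_path}. " \
        "Free up disk space or change the import path, then retry."
end
```

Failing before any bytes are written is the point: a >300GB import that dies hours in wastes far more time than an up-front estimate that is slightly conservative.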
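Because any pre-check is only an estimate, an import can still hit a full disk mid-flight. For the "improve error messaging" bullet, a sketch of wrapping the import step so that `ENOSPC` surfaces as an actionable error rather than a generic failure (`import_with_clear_errors` is a hypothetical wrapper, not an existing method):

```ruby
# Wrap the import step and translate a low-level "no space left on
# device" error into a message an admin can act on.
def import_with_clear_errors
  yield
rescue Errno::ENOSPC => e
  raise StandardError,
        "Import failed because the disk filled up (#{e.message}). " \
        "Free up space on the import volume and retry the import."
end
```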
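The queue-status question above could look something like the following. The thresholds and queue-depth inputs are pure assumptions for illustration; how (and whether) to measure queue pressure is part of the open question.

```ruby
# Hypothetical admission control based on queue depth.
CLIENT_LIMIT = 5    # concurrent imports per user   -> HTTP 429
SERVER_LIMIT = 100  # concurrent imports system-wide -> HTTP 503

# Returns the HTTP status the API/UI would respond with.
def admission_status(user_queue_depth, global_queue_depth)
  return 429 if user_queue_depth >= CLIENT_LIMIT   # client rate limited
  return 503 if global_queue_depth >= SERVER_LIMIT # server shedding load
  200
end
```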
Related