Auto-detect parallelism for pg_dump and pg_restore from RDS instance type
Related issue: #701 (closed)
Summary
Add automatic parallelism detection for pg_dump and pg_restore operations based on vCPU counts. Dump parallelism is determined by querying the vCPU count of the RDS clone's instance type via the EC2 API, while restore parallelism is based on the local machine's CPU count.
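
The resolution flow described above might look roughly like the following. This is a minimal sketch, not the code in parallelism.go: the `Parallelism` type, the `vcpuLookup` function type, and the `resolve` and `atLeastOne` helpers are hypothetical names chosen for illustration, and the EC2 query is replaced by an injected function so the example runs without AWS credentials.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// Parallelism holds job counts for pg_dump and pg_restore.
// Hypothetical type; the PR's actual return type is not shown in this description.
type Parallelism struct {
	DumpJobs    int
	RestoreJobs int
}

// vcpuLookup stands in for the EC2-backed vCPU lookup (assumed signature).
type vcpuLookup func(instanceClass string) (int, error)

// errLookupFailed is used by the fake lookup in main.
var errLookupFailed = errors.New("lookup failed")

// atLeastOne clamps a job count to a minimum of 1.
func atLeastOne(n int) int {
	if n < 1 {
		return 1
	}
	return n
}

// resolve derives dump jobs from the instance's vCPU count and restore jobs
// from the local CPU count. On lookup failure it returns the zero value and
// an error, leaving the fallback decision to the caller.
func resolve(instanceClass string, lookup vcpuLookup) (Parallelism, error) {
	vcpus, err := lookup(instanceClass)
	if err != nil {
		return Parallelism{}, fmt.Errorf("lookup vCPUs for %q: %w", instanceClass, err)
	}
	return Parallelism{
		DumpJobs:    atLeastOne(vcpus),
		RestoreJobs: atLeastOne(runtime.NumCPU()),
	}, nil
}

func main() {
	// Fake lookup: knows one instance class, fails for everything else.
	fake := func(class string) (int, error) {
		if class == "db.m5.xlarge" {
			return 4, nil
		}
		return 0, errLookupFailed
	}

	for _, class := range []string{"db.m5.xlarge", "db.unknown"} {
		p, err := resolve(class, fake)
		if err != nil {
			// p is the zero value here, which the caller can treat as
			// "keep whatever is already configured".
			fmt.Println("detection failed, using defaults:", err)
		}
		fmt.Printf("%s: dump=%d restore=%d\n", class, p.DumpJobs, p.RestoreJobs)
	}
}
```

Injecting the lookup as a function keeps the resolution logic testable with a fake, in the same spirit as the mocked EC2 API mentioned under test coverage.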
Key Changes
- New parallelism module (parallelism.go): Implements vCPU detection logic
  - ResolveParallelism(): Main entry point that determines optimal parallelism levels
  - lookupInstanceVCPUs(): Queries EC2 API for RDS instance type vCPU information
  - rdsClassToEC2Type(): Converts RDS instance class format (e.g., "db.m5.xlarge") to EC2 type format ("m5.xlarge")
  - resolveLocalVCPUs(): Returns the local machine's CPU count
  - Includes EC2 API client initialization with support for custom endpoints
- Comprehensive test coverage (parallelism_test.go): Tests all parallelism resolution functions with mocked EC2 API
  - Tests RDS class to EC2 type conversion with valid and invalid inputs
  - Tests vCPU lookup with various scenarios (valid instances, missing instances, missing vCPU info, API failures)
  - Tests local vCPU resolution
- Integration with refresh workflow (refresher.go):
  - Step 2 now resolves parallelism levels before creating RDS clone
  - Gracefully handles parallelism detection failures with fallback to defaults
  - Passes resolved parallelism to DBLab config update
- DBLab config updates (dblab.go):
  - Extended SourceConfigUpdate struct with DumpParallelJobs and RestoreParallelJobs fields
  - Updated UpdateSourceConfig() to conditionally include parallelism settings (only when > 0)
- Test coverage (dblab_test.go):
  - Added tests for successful config updates with parallelism settings
  - Added test verifying parallelism fields are omitted when zero
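
To illustrate the "only when > 0" rule for the config update, here is a small sketch. The `SourceConfigUpdate` struct below carries only the two fields named in this PR; the `buildParallelismPatch` helper and the map keys are hypothetical and do not reflect the real DBLab config schema or the actual body of UpdateSourceConfig().

```go
package main

import "fmt"

// SourceConfigUpdate mirrors only the two fields named in this PR;
// any other fields of the real struct are omitted here.
type SourceConfigUpdate struct {
	DumpParallelJobs    int
	RestoreParallelJobs int
}

// buildParallelismPatch is a hypothetical helper that returns only the
// settings with a positive value. The key names are assumptions made for
// this example.
func buildParallelismPatch(u SourceConfigUpdate) map[string]int {
	patch := make(map[string]int)
	if u.DumpParallelJobs > 0 {
		patch["dumpParallelJobs"] = u.DumpParallelJobs
	}
	if u.RestoreParallelJobs > 0 {
		patch["restoreParallelJobs"] = u.RestoreParallelJobs
	}
	return patch
}

func main() {
	// Both values resolved: both settings appear in the patch.
	fmt.Println(buildParallelismPatch(SourceConfigUpdate{DumpParallelJobs: 4, RestoreParallelJobs: 8}))

	// Detection failed (zero values): the patch is empty, so existing
	// settings on the DBLab side are left alone.
	fmt.Println(buildParallelismPatch(SourceConfigUpdate{}))
}
```

An explicit `> 0` check, rather than relying on zero-value omission alone, also keeps a negative value out of the update.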
Implementation Details
- Resolved parallelism is clamped to a minimum of 1 job, so a successful detection never yields a zero or negative job count
- RDS instance class validation ensures proper "db." prefix before EC2 API queries
- Graceful degradation: if parallelism detection fails, the refresh continues with default values (0, which preserves existing DBLab settings)
- EC2 API client supports custom endpoints for testing and non-standard AWS deployments
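
The "db." prefix validation can be sketched as follows. The function name matches the one listed under Key Changes, but the signature, error messages, and body here are assumptions for illustration, not the code from parallelism.go.

```go
package main

import (
	"fmt"
	"strings"
)

// rdsClassToEC2Type strips the "db." prefix from an RDS instance class.
// Assumed behavior: return an error when the prefix is missing or when
// nothing follows it, so no EC2 query is attempted with a malformed type.
func rdsClassToEC2Type(class string) (string, error) {
	const prefix = "db."
	if !strings.HasPrefix(class, prefix) {
		return "", fmt.Errorf("instance class %q does not start with %q", class, prefix)
	}
	ec2Type := strings.TrimPrefix(class, prefix)
	if ec2Type == "" {
		return "", fmt.Errorf("instance class %q has no type after the prefix", class)
	}
	return ec2Type, nil
}

func main() {
	for _, class := range []string{"db.m5.xlarge", "m5.xlarge", "db."} {
		ec2Type, err := rdsClassToEC2Type(class)
		if err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", class, ec2Type)
	}
}
```

Rejecting malformed classes up front means the caller can fall back to the default of 0 without making a network call.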