Cost-aware multi-objective scheduling system

• Cloud Cost Catalog: CSV-based pricing service (AWS real API, Scaleway, others mock) with provider registry pattern
• Multi-Objective Scheduler: Elastic scaling algorithm using utilization thresholds and GPU architecture equivalency
• Clean Architecture: SitePlacementService delegates to MOS, backward compatible via feature flags

Production Integration Required:

• Connect to ActionDefinition repository for wall time and resource requirements
• Collect CPU/memory/GPU utilization baselines from completed workflow executions
• Load cloud pricing data from CSV catalogs (refresh every 8 hours)
• Enable via enable_prediction=True flag in SitePlacementService

Key Algorithms:

• Bounded resources (utilization >90% CPU, >85% GPU) scale proportionally with ceiling constraints
• Unbounded resources maintain absolute usage regardless of target instance size
• Cross-architecture GPU scaling with performance ratios (H100: 4x V100 speed)
• TOPSIS optimization balances performance, cost, and energy objectives
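
The bounded/unbounded rule and the GPU performance ratio above could be sketched roughly as follows. This is one reading of the description, not the code in this merge request: the function names `predict_usage` and `scale_wall_time`, the `ceiling` parameter, and the lookup table are all hypothetical. Only the thresholds (90% CPU, 85% GPU) and the H100-to-V100 ratio of 4x come from the text.

```python
# Thresholds from the description: above these, a resource is treated as bounded.
CPU_BOUND_THRESHOLD = 0.90
GPU_BOUND_THRESHOLD = 0.85

# Relative GPU speed, normalized to V100. Only the H100 entry comes from the
# description; any further architectures would have to be measured.
GPU_SPEED_VS_V100 = {"V100": 1.0, "H100": 4.0}


def predict_usage(observed_used: float, observed_capacity: float,
                  target_capacity: float, threshold: float,
                  ceiling: float = 1.0) -> float:
    """Predict absolute resource usage on a target instance (hypothetical helper).

    `ceiling` is an assumed cap on the utilization fraction of the target.
    """
    utilization = observed_used / observed_capacity
    if utilization > threshold:
        # Bounded: the job saturated this resource, so it is assumed to keep
        # the same utilization fraction on the target, capped at the ceiling.
        return min(utilization, ceiling) * target_capacity
    # Unbounded: absolute usage is kept, whatever the target size.
    return observed_used


def scale_wall_time(wall_time_s: float, source_gpu: str, target_gpu: str) -> float:
    """Scale a measured wall time by the performance ratio of two GPU types."""
    return wall_time_s * GPU_SPEED_VS_V100[source_gpu] / GPU_SPEED_VS_V100[target_gpu]


# A job that used 7.6 of 8 cores (95%) is CPU-bounded and scales with the target.
bounded_cpu = predict_usage(7.6, 8, 16, CPU_BOUND_THRESHOLD)
# A job that used 2 of 8 cores (25%) is unbounded and keeps using 2 cores.
unbounded_cpu = predict_usage(2, 8, 16, CPU_BOUND_THRESHOLD)
# One hour measured on a V100, projected onto an H100.
h100_wall_time = scale_wall_time(3600, "V100", "H100")
```

Keeping the thresholds and ratios in module-level constants makes it easy to replace them with values derived from the collected utilization baselines.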

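For readers unfamiliar with TOPSIS, a minimal, textbook-style sketch in plain Python follows. It is not the implementation in this merge request: the `topsis` function, the candidate values, and the weights are illustrative assumptions. Each candidate is scored by its closeness to an ideal point (highest performance, lowest cost, lowest energy) relative to the worst point.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per row; higher is better.

    matrix:  one row per candidate, one column per criterion
    weights: relative importance of each criterion
    benefit: True where larger is better, False where smaller is better
    """
    n = len(weights)
    # Vector-normalize each column so criteria with different units are comparable.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]

    # Ideal and anti-ideal points, taken column by column.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, anti)
        total = d_best + d_worst
        # If all candidates are identical, every distance is zero; score them equally.
        scores.append(d_worst / total if total else 0.5)
    return scores


# Illustrative candidates: (relative performance, cost per hour, energy in watts).
candidates = [
    [4.0, 3.0, 700.0],
    [1.0, 1.0, 300.0],
    [2.0, 1.5, 400.0],
]
scores = topsis(candidates, weights=[0.5, 0.3, 0.2], benefit=[True, False, False])
best_index = max(range(len(scores)), key=scores.__getitem__)
```

The weights express how the three objectives are traded off. Their values here are placeholders and would need to come from the scheduler's configuration.
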
Edited by Michael Mercier