feat(compose): parameterize per-service cpus/mem_limit and VictoriaMetrics search flags
Closes #176 (closed)
Summary
Builds on main's existing `${VM_RETENTION_PERIOD:-336h}` (introduced by !259 (merged)) and adds env-var indirection where main still hard-codes literal values:
- Per-service `cpus:` and `mem_limit:` for all 12 services: `${<SERVICE>_CPUS:-...}` / `${<SERVICE>_MEM:-...}` (e.g. `SINK_PROMETHEUS_MEM`, `FLASK_MEM`, `CADVISOR_CPUS`, `TARGET_STANDBY_MEM`). Memory defaults are written in bytes for consistency with !238 (merged) / !248 (merged).
- Two more VictoriaMetrics tuning flags on `sink-prometheus`'s command (only the flags already on main, no new ones):
  - `-search.maxQueryDuration` → `${VM_QUERY_DURATION:-30s}`
  - `-search.maxConcurrentRequests` → `${VM_MAX_CONCURRENT_REQUESTS:-16}`
- `.env.example` documents every new variable, commented-out, with default + one-line description.
- Contract tests in `tests/compliance_vectors/test_compose_parameterization.py` (TDD: test commit first) verify (a) defaults match main's literals and (b) env-var overrides actually take effect for representative services.
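The `${VAR:-default}` form used throughout this MR falls back to the default when the variable is unset or empty. A minimal Python sketch of that rule, for illustration only: the `resolve` helper is hypothetical (it is not part of this MR or of Compose), and the `-flag=value` spelling of the two templates is an assumption about how the flags appear on the command line.

```python
import re

# Matches ${NAME:-default}; the default may be empty but cannot contain "}".
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*):-([^}]*)\}")


def resolve(template: str, env: dict[str, str]) -> str:
    """Expand ${VAR:-default}: use env[VAR] if set and non-empty, else the default."""
    def _sub(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        value = env.get(name, "")
        return value if value else default

    return _PATTERN.sub(_sub, template)


# Variable names and defaults are taken from this MR; the "-flag=value"
# layout is assumed for the example.
QUERY_DURATION = "-search.maxQueryDuration=${VM_QUERY_DURATION:-30s}"
CONCURRENCY = "-search.maxConcurrentRequests=${VM_MAX_CONCURRENT_REQUESTS:-16}"

# Unset: the default applies, so behavior matches the old literal.
print(resolve(QUERY_DURATION, {}))
# Set: the override wins.
print(resolve(QUERY_DURATION, {"VM_QUERY_DURATION": "120s"}))
# Set but empty: ":-" still falls back to the default.
print(resolve(CONCURRENCY, {"VM_MAX_CONCURRENT_REQUESTS": ""}))
```

The empty-string case is why `:-` rather than `-` suits a no-op change: a blank entry in `.env` still resolves to main's literal.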
26 new env vars total (12 services × _CPUS + _MEM, plus the 2 VM flag vars). Defaults match main's pre-MR literals byte-for-byte; this is a no-op when env vars are unset.
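The variable count follows from the naming scheme, and the same scheme can drive a check that `.env.example` documents each name as a commented-out line. The sketch below is hypothetical and not the MR's actual test: it lists only the four service prefixes named above (the real list has 12), and the sample `.env.example` text is invented for the example.

```python
# Only the four prefixes named in this MR; the full list of 12 is not shown here.
SERVICE_PREFIXES = ["SINK_PROMETHEUS", "FLASK", "CADVISOR", "TARGET_STANDBY"]
VM_FLAG_VARS = ["VM_QUERY_DURATION", "VM_MAX_CONCURRENT_REQUESTS"]


def expected_vars(prefixes: list[str]) -> list[str]:
    """Two variables per service (_CPUS, _MEM) plus the two VM flag variables."""
    names = []
    for prefix in prefixes:
        names += [f"{prefix}_CPUS", f"{prefix}_MEM"]
    return names + VM_FLAG_VARS


def undocumented(env_example_text: str, names: list[str]) -> list[str]:
    """Return the names that have no commented-out '# NAME=value' line."""
    documented = set()
    for line in env_example_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#") and "=" in stripped:
            documented.add(stripped.lstrip("#").split("=", 1)[0].strip())
    return [name for name in names if name not in documented]


# Invented sample content; the wording of the real file is not reproduced here.
SAMPLE = """\
# Upper bound on a single query's run time (sample description)
# VM_QUERY_DURATION=30s
# VM_MAX_CONCURRENT_REQUESTS=16
"""

print(len(expected_vars(SERVICE_PREFIXES)))   # 4 services * 2 + 2
print(12 * 2 + 2)                             # the MR's total for 12 services
print(undocumented(SAMPLE, VM_FLAG_VARS))
```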
Test plan
- `python3 -c "import yaml; yaml.safe_load(open('docker-compose.yml'))"` exits 0.
- `PGAI_TAG=0.15.0-rc1 REPLICATOR_PASSWORD=x VM_AUTH_USERNAME=x VM_AUTH_PASSWORD=x docker compose -f docker-compose.yml config --quiet` exits 0 with no monitoring tuning env vars set.
- `python3 -m pytest tests/compliance_vectors/test_compose_parameterization.py tests/compliance_vectors/test_flask_resources.py tests/compliance_vectors/test_cadvisor_resources.py`: 37 passed, 1 skipped (helm not available locally).
- `bash tests/compliance_vectors/check_compose_retention_config.sh`: all five retention scenarios pass with the new template form.
- First commit (`test: assert resource limits resolve from env vars ...`) fails against current `main` (TDD red), confirming the new tests actually exercise the indirection.
- CI pipeline green.
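The manual `docker compose config` step can also be scripted to inspect the rendered values. This is a rough sketch under stated assumptions, not the code in `test_compose_parameterization.py`: it assumes Docker Compose v2 is installed and uses its `config --format json` output; the helper names are hypothetical, and the required variables reuse the placeholder values from the test plan.

```python
import json
import os
import subprocess

# Placeholder values from the test plan; Compose needs them to render the file.
REQUIRED_ENV = {
    "PGAI_TAG": "0.15.0-rc1",
    "REPLICATOR_PASSWORD": "x",
    "VM_AUTH_USERNAME": "x",
    "VM_AUTH_PASSWORD": "x",
}


def compose_cmd(compose_file: str = "docker-compose.yml") -> list[str]:
    """Command that prints the fully interpolated compose model as JSON."""
    return ["docker", "compose", "-f", compose_file, "config", "--format", "json"]


def build_env(overrides: dict[str, str], base: dict[str, str]) -> dict[str, str]:
    """Layer the required placeholders, then the tuning overrides, over a base env."""
    env = dict(base)
    env.update(REQUIRED_ENV)
    env.update(overrides)
    return env


def rendered_services(overrides: dict[str, str]) -> dict:
    """Run `docker compose config` and return the rendered services mapping."""
    result = subprocess.run(
        compose_cmd(),
        env=build_env(overrides, dict(os.environ)),
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)["services"]


# Example use (requires Docker, so it is not executed here):
#   services = rendered_services({"VM_QUERY_DURATION": "120s"})
#   services["sink-prometheus"]["command"]  # should reflect the override
```

A defaults check would call `rendered_services({})` from a shell where none of the tuning variables are exported, since the helper inherits the calling environment.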
What's NOT changed
- Defaults preserved. Every parameterized value resolves to main's pre-MR literal when unset. Laptop-dev workflow unaffected.
- No new VictoriaMetrics flags. `-memory.allowedPercent` and `-search.maxQueueDuration` are deliberately out of scope: they aren't on main's `sink-prometheus` command today, and adding them would be new behavior, not parameterization.
- No production tunings. This MR only makes prod tuning possible without forking compose. Bumping actual prod defaults is a follow-up on the provisioning playbook.
- Helm chart untouched. `postgres_ai_helm/` already exposes resource overrides via `values.yaml` (!238 (merged) for Flask, !248 (merged) for cAdvisor). This MR is the compose-side counterpart.
- `init-configs.sh` and configs image untouched: separate concern (see !250 (merged)).
Related
- !238 (merged) (Flask helm resources)
- !248 (merged) (cAdvisor helm resources)
- !250 (merged) (`init-configs.sh` idempotency, same TDD shape)
- !259 (merged) (`VM_RETENTION_PERIOD` already on main)
- Issue #176 (closed)
Edited by Nikolay Samokhvalov