In-place 0.14 → 0.15 upgrade fails silently when VM_AUTH_USERNAME / VM_AUTH_PASSWORD are missing in existing .env
## Summary

While testing the in-place upgrade path on a 0.14 monitoring deployment (`docker compose pull && docker compose up -d` flow, as documented in README "Upgrading"), I found that the upgrade fails silently for users whose `.env` was generated before VictoriaMetrics basic auth landed (commit `46ed2f3`, "feat(security): add HTTP Basic Auth to VictoriaMetrics"). The greenfield `postgresai mon local-install` path already handles this (it writes `VM_AUTH_USERNAME` / `VM_AUTH_PASSWORD` to `.env`), but the in-place upgrade path doesn't migrate an existing `.env`.

## Symptoms (reproducible on any 0.14 → 0.15 upgrade where `.env` lacks `VM_AUTH_*`)

1. **`sink-prometheus` exits immediately** with:

   ```
   fatal cannot read "/postgres_ai_configs/prometheus/prometheus.yml": cannot expand environment variables: missing "VM_AUTH_USERNAME" env var
   ```

   `config/prometheus/prometheus.yml` and `config/grafana/provisioning/datasources/datasources.yml` reference `${VM_AUTH_USERNAME}` / `${VM_AUTH_PASSWORD}`; on 0.15 these are required, not optional.

2. **Grafana's provisioned `PGWatch-Prometheus` datasource** references the same vars for basic auth and silently provisions with empty credentials (or `:?` failures, depending on compose version), so VictoriaMetrics queries return 401.

3. **`postgresai mon update` runs `git pull` + `docker compose pull`, then prints "✓ Update completed successfully"**. It does not migrate `.env`, so the very next `docker compose up -d` fails for the user with the message above. There is no warning that new required env vars exist.

4. **`postgresai mon update-config` runs the `sources-generator` container** but also does not append the new required env vars to `.env`. A version-aware migration step is missing, something like `grep -q '^VM_AUTH_USERNAME=' .env || append_with_random`.

5. **No "env contract" between a `PGAI_TAG` version and the keys required in `.env`.** There is currently no template / manifest of "these keys must exist for version X", so every version that introduces a new required env var risks repeating this break for any user not on a brand-new install.

## Workaround a user has to discover

After the upgrade fails, the user has to manually:

```bash
echo "VM_AUTH_USERNAME=vmauth" >> .env
echo "VM_AUTH_PASSWORD=$(openssl rand -base64 18)" >> .env
docker compose up -d --force-recreate sink-prometheus grafana
```

`scripts/rotate-vm-auth.sh` does exactly this and is referenced from the README, but it's a rotation tool: discoverability for a first-time upgrader is low, and nothing in `postgresai mon update` points at it.

## What we should fix

- `postgresai mon update` and `postgresai mon update-config` should both run the same `.env` migration that `mon local-install` does, i.e., append any missing required keys (`VM_AUTH_USERNAME`, `VM_AUTH_PASSWORD`, `REPLICATOR_PASSWORD`, …) with safe random defaults before doing anything else. The migration is purely additive; existing values must be preserved verbatim.
- The README "Upgrading" section (added in !170) should explicitly mention that the `pgai mon update` path now performs this migration, so future users don't fall back to raw `docker compose pull` + `up`.
- A regression test in `cli/test/upgrade.test.ts` should pin this: given a 0.14-shaped `.env` (no `VM_AUTH_*`), running the upgrade path must result in both keys being present.

## Why this matters / SOC2 angle

This is a silently-failing upgrade for any customer who installed before VictoriaMetrics basic auth was required. It also means the monitoring datasource ends up with an empty password, which would otherwise look like a credentials-leaked scenario in audit logs.
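
## Sketch of the additive migration

To make the "purely additive" requirement from "What we should fix" concrete, here is a minimal sketch of what such a migration step might look like. It is not the CLI's existing code: `migrateEnv`, `REQUIRED_ENV_KEYS`, and `MigrationResult` are hypothetical names, and the manifest shape is only one possible form of the "env contract" described above. The default username and the 18-byte base64 password mirror the manual workaround.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical manifest of keys a release expects in .env, each with a
// generator for a safe default. Key names come from this issue; the
// structure is an assumption. Further keys could be listed the same way.
const REQUIRED_ENV_KEYS: Record<string, () => string> = {
  VM_AUTH_USERNAME: () => "vmauth",
  VM_AUTH_PASSWORD: () => randomBytes(18).toString("base64"),
};

interface MigrationResult {
  content: string; // updated .env text
  added: string[]; // keys that were appended
}

// Appends every missing required key to the end of the .env text.
// Existing lines are never rewritten, reordered, or removed.
function migrateEnv(
  envText: string,
  required: Record<string, () => string> = REQUIRED_ENV_KEYS,
): MigrationResult {
  // Collect keys that are already defined; commented-out lines do not count.
  const present = new Set<string>();
  for (const line of envText.split(/\r?\n/)) {
    const match = /^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    if (match) present.add(match[1]);
  }

  const added: string[] = [];
  let content = envText;
  for (const [key, makeDefault] of Object.entries(required)) {
    if (present.has(key)) continue;
    // Make sure the appended entry starts on its own line.
    if (content.length > 0 && !content.endsWith("\n")) content += "\n";
    content += `${key}=${makeDefault()}\n`;
    added.push(key);
  }
  return { content, added };
}
```

A caller would read `.env`, pass its text to `migrateEnv`, write `content` back only when `added` is non-empty, and print the added key names so the user knows new credentials were generated. Running it a second time on its own output adds nothing, which is the property the proposed regression test could assert.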