Stability checkpoint — Customer Churn Phase C (Python ONNX inference) + dual-backend ML serving
- feat(ml): Phase C — Python in-process ONNX inference for Customer Churn
- fix(ml): drop unused noqa BLE001 on churn predictor init
- fix(ml): move runtime tests under tests/unit/ml + fix docs links
Minor bump (0.6.11 → 0.7.0): Phase C ships a new public capability (REST + MCP surface for churn prediction), and a `feat` commit maps to a SemVer MINOR bump under Conventional Commits.
CI:
- ✅ Main pipeline #2482908218 green — https://gitlab.com/mirador1/mirador-service-python/-/pipelines/2482908218
- ✅ MR pipeline #2482779639 green for !36 — https://gitlab.com/mirador1/mirador-service-python/-/pipelines/2482779639
- ✅ Phase C MR !36 merged via auto-merge — https://gitlab.com/mirador1/mirador-service-python/-/merge_requests/36
- ✅ Fix MR !37 + !38 (ruff RUF100 + tests path + docs links) merged — https://gitlab.com/mirador1/mirador-service-python/-/merge_requests/38
Local test passes:
- ✅ uv run pytest — 325 passed, coverage 90.49 % (gate 90 %)
- ✅ uv run ruff check src tests — clean
- ✅ uv run ruff format --check src tests — clean
- ✅ uv run mypy src — clean
- ⏭ uv run pytest tests/integration -m integration — N/A: Phase C only touches the runtime ml/* slice and adds no integration test; the prediction endpoint is covered by tests/unit/ml/test_router_churn using a stub predictor and SQLite instead of Testcontainers Postgres
- ⏭ Manual MCP query against a running service — deferred until Phase F provisions the ConfigMap. Tool registration is verified in tests/unit/mcp/test_mount (14 → 15 expected tools)
Regression check vs stable-py-v0.6.11:
- ✅ MCP catalogue: 14 → 15 tools (predict_customer_churn added). test_mount + test_dtos assert the new entry.
- ✅ FastAPI lifespan boots even when /etc/models/churn_predictor.onnx is missing — ChurnPredictor.is_ready() returns False, the REST endpoint serves 503, and every other endpoint keeps working unchanged.
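The graceful-degradation contract above can be sketched as follows. This is a minimal illustration, not the actual ChurnPredictor implementation: the class name `ChurnPredictorSketch`, the `load()` method, and the `churn_endpoint` helper are hypothetical stand-ins for the real lifespan hook and FastAPI route.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ChurnPredictorSketch:
    """Hypothetical sketch: never raise at boot when the model is absent."""

    model_path: Path
    _session: Optional[object] = None

    def load(self) -> None:
        # Only load when the file is actually present; a missing ONNX file
        # leaves the predictor in a not-ready state instead of crashing boot.
        if self.model_path.exists():
            self._session = object()  # stand-in for onnxruntime.InferenceSession

    def is_ready(self) -> bool:
        return self._session is not None


def churn_endpoint(predictor: ChurnPredictorSketch) -> tuple[int, str]:
    """Return (status_code, body); serve 503 while the model is not provisioned."""
    if not predictor.is_ready():
        return 503, "churn model not provisioned"
    return 200, "prediction"


predictor = ChurnPredictorSketch(Path("/etc/models/churn_predictor.onnx"))
predictor.load()  # file missing in dev/CI → predictor stays not-ready
status, _ = churn_endpoint(predictor)
print(status)  # → 503 when the model file is absent
```

Every other route never consults the predictor, which is why the rest of the service is unaffected.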
- ✅ Cross-language guarantee (per shared ADR-0060): the 8-feature extractor on this side is parity-tested against Java's golden inputs (test_inference.py mirrors ChurnFeatureExtractorTest exactly).
- LLM integration: FastMCP + Ollama (unchanged from the previous tag).
- AI observability: gen_ai.* OTel spans → Tempo (unchanged).
- **NEW** Trained model in-process: ChurnPredictor wraps onnxruntime>=1.21,<2 (added to the main [project.dependencies] — 30 MB, runtime self-contained, training stack stays in the optional [ml] extra). No sidecar, no network hop per inference, identical predictions across Java + Python (ADR-0060). The 8-feature extractor (mirador_service.ml.inference.extract_features) is parity-tested against its Java sibling and robust to mixed-timezone datetimes (SQLite vs Postgres).
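The mixed-timezone robustness mentioned above typically comes down to one normalization step: SQLite drivers tend to return naive datetimes while Postgres returns tz-aware ones, and subtracting one from the other raises `TypeError`. A sketch of that normalization, with hypothetical helper names (`as_utc`, `tenure_days` are illustrations, not the actual extractor API):

```python
from datetime import datetime, timezone


def as_utc(dt: datetime) -> datetime:
    """Treat naive datetimes as UTC; convert aware ones to UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def tenure_days(signup: datetime, now: datetime) -> int:
    # Subtracting a naive datetime from an aware one raises TypeError;
    # normalising both sides first removes that class of bug entirely.
    return (as_utc(now) - as_utc(signup)).days


naive = datetime(2024, 1, 1)                         # SQLite-style value
aware = datetime(2024, 1, 31, tzinfo=timezone.utc)   # Postgres-style value
print(tenure_days(naive, aware))  # → 30
```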
- **NEW** MCP tool 15: predict_customer_churn(customer_id) → ChurnPrediction | ChurnNotFound | ChurnServiceUnavailable. Soft-error DTOs match the shape of Java's ChurnMcpToolService for LLM-caller robustness.
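The soft-error pattern behind that tool can be sketched like this: every failure mode is a typed DTO rather than an exception, so an LLM caller always receives a well-formed payload it can reason about. The field names, the `ready`/`known_ids` parameters, and the 0.42 probability are all hypothetical; only the three DTO names come from the source.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ChurnPrediction:
    customer_id: str
    probability: float
    risk_band: str


@dataclass
class ChurnNotFound:
    customer_id: str
    message: str = "customer not found"


@dataclass
class ChurnServiceUnavailable:
    message: str = "churn model not provisioned"


ChurnResult = Union[ChurnPrediction, ChurnNotFound, ChurnServiceUnavailable]


def predict_customer_churn(customer_id: str, *, ready: bool, known_ids: set[str]) -> ChurnResult:
    """Soft-error variant: every failure mode is a DTO, never an exception."""
    if not ready:
        return ChurnServiceUnavailable()
    if customer_id not in known_ids:
        return ChurnNotFound(customer_id)
    return ChurnPrediction(customer_id, probability=0.42, risk_band="MEDIUM")


result = predict_customer_churn("c-1", ready=False, known_ids=set())
print(type(result).__name__)  # → ChurnServiceUnavailable
```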
- **NEW** Risk-band classification (LOW/MEDIUM/HIGH) with thresholds 0.3 / 0.7. Boundary semantics mirror Java's RiskBand.
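A minimal sketch of those thresholds. The boundary semantics assumed here (half-open intervals, so exactly 0.3 is MEDIUM and exactly 0.7 is HIGH) are an assumption for illustration; the source only states that they mirror Java's RiskBand.

```python
def risk_band(probability: float) -> str:
    """Map a churn probability to LOW/MEDIUM/HIGH using the 0.3 / 0.7 cut-offs.

    Assumed convention: [0, 0.3) → LOW, [0.3, 0.7) → MEDIUM, [0.7, 1] → HIGH.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability out of range: {probability}")
    if probability < 0.3:
        return "LOW"
    if probability < 0.7:
        return "MEDIUM"
    return "HIGH"


print(risk_band(0.29), risk_band(0.3), risk_band(0.7))  # → LOW MEDIUM HIGH
```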
- AuthN: JWT + X-API-Key + ApiKeyMiddleware (unchanged). The new /customers/{id}/churn-prediction endpoint inherits the same chain.
- AuthZ: authentication required (no special role). Predictions are read-only.
- CVE posture: pip-audit clean; new onnxruntime + numpy versions pinned (no floating tag).
- Headers + filters: CSP, HSTS, rate limiting, idempotency, request-ID correlation — all unchanged.
- New domain feature: Customer Churn prediction REST endpoint — POST /customers/{id}/churn-prediction → ChurnPrediction.
- New MCP tool: predict_customer_churn (soft-error DTOs).
- Breaking-API check vs the previous tag: none. Net additions only.
- Deploy targets: same multi-cloud matrix (unchanged from the previous tag).
- IaC: Terraform unchanged.
- Cost discipline: ≤ €2/month idle (ADR-0022).
- ConfigMap mount path /etc/models/churn_predictor.onnx is wired into the deployment manifests via shared !4 (Phase F).
- SLO/SLA: 3 SLOs as code (unchanged baselines).
- Tracing / metrics / logs: OTel exporter, Tempo / Mimir / Loki tail (unchanged).
- New observability surface (Phase E): drift SLO + KS-test daily series — DEFERRED to the next session.
- Coverage: pytest --cov-fail-under=90 — 90.49 % achieved (vs 89.83 % at HEAD before the per-file omit rewrite). Runtime ml/* files now contribute via tests/unit/ml/.
- Mutation: mutmut 3.5.0 (configured for auth/jwt + auth/passwords; unchanged baseline).
- Static analysis: ruff + mypy + import-linter all clean.
- Test pyramid: +31 unit tests in tests/unit/ml/ (test_risk_band 10, test_dtos 4, test_inference 12, test_router_churn 5).
- Pipeline stages green: validate (lint/format/mypy/pip-audit/import-linter) | test (unit + integration + benchmarks) | quality (SonarCloud) | docs (mkdocs strict) | deploy.
- Compat matrix: Python 3.13 default + 3.12 + 3.11 (unchanged baselines).
- Release engineering: Conventional Commits respected (feat(ml), fix(ml)). MINOR bump (0.6.11 → 0.7.0) because of the new public surface.
- ADRs: shared ADR-0060/0061/0062 — the Phase C amendment landed via shared !3.
- Patterns enforced: Hexagonal Lite, feature slicing, and the Clean Code 7 non-negotiables — function size ≤ 30 LOC, SRP, naming, why-comments, the dependency rule, test-as-spec, no dead code. The new ml/ package follows the same conventions.
- File length: inference.py 264 LOC, dtos.py 76 LOC, router.py 118 LOC, risk_band.py 61 LOC, predictor_singleton.py 44 LOC — all well under the 1 000-LOC ceiling.
- ⏭ N/A — backend-only repo.
- onnxruntime + numpy added to the main [project.dependencies] (no extra needed for inference) — works out of the box for any developer cloning the repo.
- Coverage `omit` rewritten per file (training files keep the omit, runtime files drop it). Tests under tests/unit/ml/ run on every CI invocation; no [ml] extra needed.
- Documentation: new docs/ml/churn-prediction.md (REST + MCP usage, the ONNX cross-language guarantee, and model provisioning).
- ONNX file not yet provisioned in dev/CI: the prediction endpoints return 503 until bin/ml/promote_to_configmap.sh (Phase F) runs. The graceful-degradation contract is verified by tests.
- top_features is a placeholder list (the canonical priority sequence) until Phase E adds per-prediction SHAP explanations.
- Phase C stubs onnxruntime.InferenceSession in unit tests; the real-model path is deferred to the cross-language smoke test (Phase G).
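The stubbing approach mentioned above can be sketched like this. `StubInferenceSession` is a hypothetical minimal fake that mimics the call shape of `onnxruntime.InferenceSession.run` (a list of outputs for a batched input feed) using plain lists, so the prediction path can be unit-tested without the real .onnx file or the onnxruntime wheel.

```python
class StubInferenceSession:
    """Hypothetical stand-in for onnxruntime.InferenceSession.

    run(output_names, input_feed) mirrors the real signature: onnxruntime
    returns a list of output arrays, here one probability per input row.
    """

    def __init__(self, probability: float) -> None:
        self._probability = probability

    def run(self, output_names, input_feed):
        # One probability row per row of the (single) input batch.
        rows = len(next(iter(input_feed.values())))
        return [[[self._probability]] * rows]


session = StubInferenceSession(0.5)
features = [[0.0] * 8]  # one customer, the 8-feature vector
(probs,) = session.run(None, {"input": features})
print(probs[0][0])  # → 0.5
```

Because the stub honours the real `run()` shape, swapping in a genuine `InferenceSession` later (Phase G) should require no changes to the calling code.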
- stable-py-v0.7.1 : MLflow tracking server in dev compose stack (Phase E start).
- stable-py-v0.8.0 : drift SLO + Grafana dashboard + drift runbook (Phase E full scope).