v0.7.0 — token-burn: Cap server-side field sizes for content_json, discussion body, file_registry summary
## Problem

Several DB columns are `TEXT` with no size cap in the schema or write-path validation:

- `audit.content_json`
- `discussions.body`
- `file_registry.summary`
- `tasks.spec_body` (has `SPEC_BODY_MAX_BYTES` constant in code, default cap exists, but configurable via env)

Without enforced caps, a single rogue write (paste of a giant log, accidentally-large JSON, runaway summary) would bloat the DB and amplify into every subsequent read that joins or returns the row.

**This is a preventative issue, not a remedial one.** Live DB on 2026-05-17 shows current values are well-bounded (see Evidence). But the lack of structural caps means a future bad actor (or rogue subagent) can poison the DB with arbitrarily large blobs.

## Evidence (verified against live DB + source)

Current state (well-bounded):

- `audit.content_json`: 67 rows, avg = 142 chars, **max = 726 chars** (nowhere near the 1 MB documented limit)
- `discussions.body`: 25 rows, avg = 975 chars, **max = 2,551 chars**
- `file_registry.summary`: 925 with summaries, avg = 155 chars

Schema state (no LENGTH CHECK present):

- Verified `mcp/trajectory-server/src/schema.sql`: `grep -nE "LENGTH\(" schema.sql` returns nothing
- `TARGET_SCHEMA_VERSION = 2` in `db.ts:9`; v1→v2 migration exists; v2→v3 needed for caps

Already-capped (in code, not schema):

- `tasks.spec_body`: `SPEC_BODY_MAX_BYTES` constant in `tools/tasks.ts`, used by `composites.ts` for `task_retry_batch`
- `roundtable_create.retry_rationale`: "≤200 chars" per schema description (advisory, not enforced server-side)

## Plan

Targets are deliberately conservative: caps generous enough that normal writes don't trip them, strict enough to prevent runaway bloat.

1. **Schema v3**: add `CHECK(LENGTH(body) <= 4096)` on `discussions.body`, `summary <= 1024` on `file_registry.summary`, and `content_json <= 8192` on `audit.content_json`.
2. **Server-level validation**: reject writes exceeding cap with `{"error": "field_too_large", "field": "body", "current_size": 4521, "cap": 4096}`. Wire through `requireRoles` middleware so the error path is uniform.
3. **Migration v3 backfill**: for existing rows over cap (currently zero, but possible in deployed instances), truncate to cap with trailing `[…truncated N chars]` marker and log to `audit` as `field_truncated_in_migration` with the original length.
4. **Lint**: `tests/lint/no-uncapped-text-columns.sh` to prevent regressions: any new `TEXT` column without `CHECK(LENGTH(...) <= ...)` or explicit allowlist must fail lint.
5. **Document caps** in `mcp/trajectory-server/docs/SCHEMA.md` (or equivalent) so contributors know the field-size contract.

## Acceptance criteria

- New L2 tests verify each capped column rejects oversize writes with structured error.
- Migration v3 backfill test verifies pre-cap rows over cap are truncated correctly.
- Live DB migration on this repo completes without truncating any current rows (all current rows are within proposed caps; sanity check).
- New lint test verifies the rule catches uncapped new columns.

## Out of scope

- Compaction of normal-size historical rows.
- Different caps per `event_type` for audit (could be a v0.7+ refinement).

## Coordination

- Pairs with #2919 (P1 hallucination): both want v2→v3 schema migration; should ship as one migration.
- Pairs with #2918 (cache discipline): capped fields are cache-friendlier (predictable size).

## Note on source

Previous description claimed "1 MB allowed per content_json" without verifying. Source check (`tools/audit.ts` registration) shows no enforced cap in schema or middleware, only advisory in the tool description. Verified live max sizes are tiny. Reframed as preventative architecture, not remediation.
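
## Sketch: validation and backfill helpers

To make Plan items 2 and 3 concrete, here is a minimal TypeScript sketch of a write-path size check and a migration-time truncation helper. It is not the server's actual code: `FIELD_CAPS`, `charLength`, `checkFieldSize`, and `truncateToCap` are hypothetical names. It assumes that caps are measured in characters (to line up with `CHECK(LENGTH(...))`) and that a truncated value, marker included, must itself fit within the cap so the new constraint accepts it.

```typescript
// Hypothetical cap table; the numbers are the ones proposed in the Plan.
const FIELD_CAPS = {
  "discussions.body": 4096,
  "file_registry.summary": 1024,
  "audit.content_json": 8192,
} as const;

type CappedColumn = keyof typeof FIELD_CAPS;

interface FieldTooLargeError {
  error: "field_too_large";
  field: string;
  current_size: number;
  cap: number;
}

// Assumption: sizes are counted in Unicode code points rather than bytes
// or UTF-16 code units, so the server check agrees with a character-based
// LENGTH() constraint in the schema.
function charLength(value: string): number {
  return Array.from(value).length;
}

// Returns null when the value fits, or the structured error otherwise.
function checkFieldSize(
  column: CappedColumn,
  value: string,
): FieldTooLargeError | null {
  const cap: number = FIELD_CAPS[column];
  const size = charLength(value);
  if (size <= cap) return null;
  // Report only the column part of the name, as in the Plan's example payload.
  const field = column.slice(column.indexOf(".") + 1);
  return { error: "field_too_large", field, current_size: size, cap };
}

// Truncates an over-cap value so that kept text plus marker fits in `cap`.
// N in the marker is the number of characters dropped from the original.
function truncateToCap(value: string, cap: number): string {
  const chars = Array.from(value);
  if (chars.length <= cap) return value;
  // Shrink the kept prefix until prefix + marker fits within the cap.
  for (let keep = cap - 1; keep >= 0; keep--) {
    const marker = `[…truncated ${chars.length - keep} chars]`;
    if (keep + charLength(marker) <= cap) {
      return chars.slice(0, keep).join("") + marker;
    }
  }
  throw new Error("cap is too small to hold the truncation marker");
}
```

A write handler would call `checkFieldSize` before the insert and return the error object unchanged. The migration would call `truncateToCap` only for rows that fail the check, recording the original length in the `field_truncated_in_migration` audit entry. Whether the marker should count against the cap is a design choice this issue leaves open; the sketch counts it so that backfilled rows cannot violate the new constraint.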