v0.7.0 — token-burn: Cap server-side field sizes for content_json, discussion body, file_registry summary
## Problem
Several DB columns are `TEXT` with no size cap in the schema or write-path validation:
- `audit.content_json`
- `discussions.body`
- `file_registry.summary`
- `tasks.spec_body` (partial exception: a `SPEC_BODY_MAX_BYTES` constant in code supplies a default cap, but it is configurable via env and is not enforced in the schema)
Without enforced caps, a single rogue write (paste of a giant log, accidentally-large JSON, runaway summary) would bloat the DB and amplify into every subsequent read that joins or returns the row.
**This is a preventative issue, not a remedial one.** Live DB on 2026-05-17 shows current values are well-bounded (see Evidence). But the lack of structural caps means a future bad actor (or rogue subagent) can poison the DB with arbitrarily large blobs.
## Evidence (verified against live DB + source)
Current state — well-bounded:
- `audit.content_json`: 67 rows, avg = 142 chars, **max = 726 chars** (far below the 1 MB figure in the tool description, which is advisory and not enforced)
- `discussions.body`: 25 rows, avg = 975 chars, **max = 2,551 chars**
- `file_registry.summary`: 925 rows with summaries, avg = 155 chars
Schema state — no LENGTH CHECK present:
- Verified `mcp/trajectory-server/src/schema.sql` — `grep -nE "LENGTH\(" schema.sql` returns nothing
- `TARGET_SCHEMA_VERSION = 2` in `db.ts:9`; v1→v2 migration exists; v2→v3 needed for caps
Already-capped (in code, not schema):
- `tasks.spec_body` — `SPEC_BODY_MAX_BYTES` constant in `tools/tasks.ts`, used by `composites.ts` for `task_retry_batch`
- `roundtable_create.retry_rationale` — "≤200 chars" per schema description (advisory, not enforced server-side)
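
The schema check above could also be run programmatically. The following is a minimal TypeScript sketch, not code from the repository: `findUncappedTextColumns`, the allowlist parameter, and the sample table are hypothetical, and it assumes one column definition per line. It mirrors the `grep` check by flagging every `TEXT` column whose definition line has no `CHECK(LENGTH(...))`.

```typescript
// Hypothetical helper, not part of the repository.
// Assumes each column definition sits on its own line in the schema file.
function findUncappedTextColumns(schemaSql: string, allowlist: string[] = []): string[] {
  const uncapped: string[] = [];
  let table = "";
  for (const line of schemaSql.split("\n")) {
    // Track which table the following column lines belong to.
    const create = /CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?(\w+)/i.exec(line);
    if (create) {
      table = create[1] ?? "";
      continue;
    }
    // Only TEXT columns are of interest.
    const column = /^\s*(\w+)\s+TEXT\b/i.exec(line);
    if (!column || table === "") continue;
    const name = `${table}.${column[1] ?? ""}`;
    // A column counts as capped when its own line carries CHECK(LENGTH(...)).
    const capped = /CHECK\s*\(\s*LENGTH\s*\(/i.test(line);
    if (!capped && allowlist.indexOf(name) === -1) uncapped.push(name);
  }
  return uncapped;
}

// Illustrative schema; the `author` column is invented for the example.
const sampleSchema = [
  "CREATE TABLE discussions (",
  "  id INTEGER PRIMARY KEY,",
  "  body TEXT NOT NULL,",
  "  author TEXT CHECK(LENGTH(author) <= 64)",
  ");",
].join("\n");

console.log(findUncappedTextColumns(sampleSchema)); // [ 'discussions.body' ]
```

Table-level constraints declared on a separate line would not be recognized by this line-based scan, so a production check would need a proper SQL parser or a stricter formatting convention.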
## Plan
Targets are deliberately conservative — set caps generous enough that normal writes don't trip them, strict enough to prevent runaway bloat.
1. **Schema v3**: add `CHECK(LENGTH(body) <= 4096)` on `discussions.body`, `CHECK(LENGTH(summary) <= 1024)` on `file_registry.summary`, and `CHECK(LENGTH(content_json) <= 8192)` on `audit.content_json`.
2. **Server-level validation**: reject any write that exceeds its cap with `{"error": "field_too_large", "field": "body", "current_size": 4521, "cap": 4096}`. Wire this through the `requireRoles` middleware so the error path is uniform.
3. **Migration v3 backfill**: for existing rows over cap (currently zero here, but possible in deployed instances), truncate to the cap with a trailing `[…truncated N chars]` marker and log each truncation to `audit` as `field_truncated_in_migration` with the original length.
4. **Lint**: add `tests/lint/no-uncapped-text-columns.sh` to prevent regressions. Any new `TEXT` column without `CHECK(LENGTH(...) <= ...)` or an explicit allowlist entry must fail lint.
5. **Document caps** in `mcp/trajectory-server/docs/SCHEMA.md` (or equivalent) so contributors know the field-size contract.
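
The server-level validation in step 2 might look roughly like the sketch below. It is an illustration under stated assumptions rather than the actual implementation: `FIELD_CAPS`, `FieldTooLargeError`, and `checkFieldCap` are hypothetical names, and the sketch measures size with JavaScript's `String.length` (UTF-16 code units), which may not match how the database's `LENGTH()` counts characters. How the check is wired into `requireRoles` is left out.

```typescript
// Shape of the structured error proposed in the plan.
interface FieldTooLargeError {
  error: "field_too_large";
  field: string;
  current_size: number;
  cap: number;
}

// Caps proposed in the plan, keyed by column name (hypothetical lookup table).
const FIELD_CAPS: Readonly<Record<string, number>> = {
  body: 4096, // discussions.body
  summary: 1024, // file_registry.summary
  content_json: 8192, // audit.content_json
};

// Returns null when the write is acceptable, or the structured error otherwise.
// Assumption: size is measured in UTF-16 code units via String.length.
function checkFieldCap(field: string, value: string): FieldTooLargeError | null {
  const cap = FIELD_CAPS[field];
  if (cap === undefined) return null; // no cap configured for this field
  if (value.length <= cap) return null; // within cap
  return { error: "field_too_large", field, current_size: value.length, cap };
}
```

Under these assumptions, a 4,521-character `body` yields the same payload as the example in step 2, while any value at or below the cap returns `null` and the write proceeds.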
## Acceptance criteria
- New L2 tests verify that each capped column rejects oversize writes with a structured error.
- A migration v3 backfill test verifies that pre-existing rows over cap are truncated correctly.
- Live DB migration on this repo completes without truncating any current rows (all current rows are within proposed caps — sanity check).
- New lint test verifies the rule catches uncapped new columns.
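
The truncation that the backfill test would exercise can be sketched as follows. This is a hypothetical helper (`truncateToCap` is not a name from the codebase), and it assumes the marker must itself fit inside the cap so that the truncated row passes the new `CHECK`; the issue does not state this explicitly. Writing the `field_truncated_in_migration` audit entry is omitted.

```typescript
// Hypothetical backfill helper: shortens `value` so that the kept prefix plus
// the "[…truncated N chars]" marker is exactly `cap` long.
// Assumptions: lengths are UTF-16 code units, and `cap` is larger than the marker.
function truncateToCap(value: string, cap: number): string {
  if (value.length <= cap) return value; // nothing to do

  // The marker's length depends on the digit count of N, and N depends on how
  // much is kept, so settle the two together over a few passes.
  let keep = cap;
  let marker = "";
  for (let pass = 0; pass < 4; pass++) {
    marker = `[…truncated ${value.length - keep} chars]`;
    const next = Math.max(0, cap - marker.length);
    if (next === keep) break; // marker and prefix length now agree
    keep = next;
  }
  // Note: slicing by code unit can split a surrogate pair at the boundary.
  return value.slice(0, keep) + marker;
}
```

For a 5,000-character value and a cap of 4,096, this keeps the first 4,074 characters and appends `[…truncated 926 chars]`, giving a result of exactly 4,096 characters. The migration would record the original length (5,000) in the audit row before overwriting the column.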
## Out of scope
- Compaction of normal-size historical rows.
- Different caps per `event_type` for audit (could be a v0.7+ refinement).
## Coordination
- Pairs with #2919 (P1 hallucination) — both want v2→v3 schema migration; should ship as one migration.
- Pairs with #2918 (cache discipline) — capped fields are cache-friendlier (predictable size).
## Note on source
Previous description claimed "1 MB allowed per content_json" without verifying. Source check (`tools/audit.ts` registration) shows no enforced cap in schema or middleware — only advisory in the tool description. Verified live max sizes are tiny. Reframed as preventative architecture, not remediation.