v0.7.0 — token-burn: Cap server-side field sizes for content_json, discussion body, file_registry summary
Problem
Several DB columns are TEXT with no size cap in the schema or write-path validation:
- audit.content_json
- discussions.body
- file_registry.summary
- tasks.spec_body (has a SPEC_BODY_MAX_BYTES constant in code; a default cap exists, but it is configurable via env)
Without enforced caps, a single rogue write (paste of a giant log, accidentally-large JSON, runaway summary) would bloat the DB and amplify into every subsequent read that joins or returns the row.
This is a preventative issue, not a remedial one. Live DB on 2026-05-17 shows current values are well-bounded (see Evidence). But the lack of structural caps means a future bad actor (or rogue subagent) can poison the DB with arbitrarily large blobs.
Evidence (verified against live DB + source)
Current state — well-bounded:
- audit.content_json: 67 rows, avg = 142 chars, max = 726 chars (nowhere near the 1 MB documented limit)
- discussions.body: 25 rows, avg = 975 chars, max = 2,551 chars
- file_registry.summary: 925 with summaries, avg = 155 chars
Schema state — no LENGTH CHECK present:
- Verified mcp/trajectory-server/src/schema.sql: grep -nE "LENGTH\(" schema.sql returns nothing
- TARGET_SCHEMA_VERSION = 2 in db.ts:9; v1→v2 migration exists; v2→v3 needed for caps
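The grep above only shows that LENGTH( appears nowhere in the schema. A per-column version of the same check is sketched below, in TypeScript to match the server code. The function names and the regex-based parsing are assumptions made for illustration, not existing code in the repo, and the sketch only recognizes column-level CHECK constraints.

```typescript
// Split a CREATE TABLE body on commas that are not nested inside parentheses.
function splitTopLevel(body: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let current = "";
  for (const ch of body) {
    if (ch === "(") depth++;
    if (ch === ")") depth--;
    if (ch === "," && depth === 0) {
      parts.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  if (current.trim() !== "") parts.push(current.trim());
  return parts;
}

// Hypothetical helper: returns "table.column" for every TEXT column whose
// definition carries no CHECK(LENGTH(...)) constraint.
function findUncappedTextColumns(schemaSql: string): string[] {
  const uncapped: string[] = [];
  const tableRe =
    /CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?(\w+)\s*\(([\s\S]*?)\)\s*;/gi;
  let m: RegExpExecArray | null;
  while ((m = tableRe.exec(schemaSql)) !== null) {
    const table = m[1];
    for (const def of splitTopLevel(m[2])) {
      const column = /^(\w+)\s+TEXT\b/i.exec(def);
      if (column && !/CHECK\s*\(\s*LENGTH\s*\(/i.test(def)) {
        uncapped.push(`${table}.${column[1]}`);
      }
    }
  }
  return uncapped;
}
```

SQL comments containing commas or parentheses, and table-level constraints, would confuse this parser, so it is a starting point rather than a complete check.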
Already-capped (in code, not schema):
- tasks.spec_body: SPEC_BODY_MAX_BYTES constant in tools/tasks.ts, used by composites.ts for task_retry_batch
- roundtable_create.retry_rationale: "≤200 chars" per schema description (advisory, not enforced server-side)
Plan
Targets are deliberately conservative — set caps generous enough that normal writes don't trip them, strict enough to prevent runaway bloat.
- Schema v3: add CHECK(LENGTH(body) <= 4096) on discussions.body, summary <= 1024 on file_registry.summary, and content_json <= 8192 on audit.content_json.
- Server-level validation: reject writes exceeding cap with {"error": "field_too_large", "field": "body", "current_size": 4521, "cap": 4096}. Wire through requireRoles middleware so the error path is uniform.
- Migration v3 backfill: for existing rows over cap (currently zero, but possible in deployed instances), truncate to cap with trailing […truncated N chars] marker and log to audit as field_truncated_in_migration with the original length.
- Lint: tests/lint/no-uncapped-text-columns.sh to prevent regressions. Any new TEXT column without CHECK(LENGTH(...) <= ...) or an explicit allowlist entry must fail lint.
- Document caps in mcp/trajectory-server/docs/SCHEMA.md (or equivalent) so contributors know the field-size contract.
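A minimal sketch of the server-level check follows, assuming the caps are kept in a single table keyed by column. FIELD_CAPS, charLength, and checkFieldSize are hypothetical names; only the cap values and the error shape come from the plan. It also assumes the database counts characters in LENGTH() on TEXT (as SQLite does), which is why the sketch counts Unicode code points rather than bytes or UTF-16 units. Note that SPEC_BODY_MAX_BYTES is, by its name, a byte limit, so the two kinds of cap are not directly comparable.

```typescript
// Hypothetical cap table; the numbers are the ones proposed in the plan.
const FIELD_CAPS = {
  "discussions.body": 4096,
  "file_registry.summary": 1024,
  "audit.content_json": 8192,
} as const;

type CappedColumn = keyof typeof FIELD_CAPS;

// Error shape taken from the plan's example payload.
interface FieldTooLargeError {
  error: "field_too_large";
  field: string;
  current_size: number;
  cap: number;
}

// Count Unicode code points so the result lines up with a character-counting
// SQL LENGTH() (assumed semantics); string.length would count UTF-16 units.
function charLength(value: string): number {
  return Array.from(value).length;
}

// Returns null when the value fits, or the structured error when it does not.
function checkFieldSize(
  column: CappedColumn,
  value: string
): FieldTooLargeError | null {
  const cap = FIELD_CAPS[column];
  const size = charLength(value);
  if (size <= cap) return null;
  // Report the bare column name, matching "field": "body" in the plan.
  const field = column.slice(column.indexOf(".") + 1);
  return { error: "field_too_large", field, current_size: size, cap };
}
```

With these assumptions, checkFieldSize("discussions.body", text) for a 4521-character text yields the same object as the example in the plan, and a value exactly at the cap passes.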
Acceptance criteria
- New L2 tests verify each capped column rejects oversize writes with structured error.
- Migration v3 backfill test verifies pre-cap rows over cap are truncated correctly.
- Live DB migration on this repo completes without truncating any current rows (all current rows are within proposed caps — sanity check).
- New lint test verifies the rule catches uncapped new columns.
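For the backfill test, "truncated correctly" needs a precise definition. One reading is sketched below: the marker counts toward the cap, so the migrated row still satisfies the new CHECK, and N is the number of characters removed. truncateToCap and its result type are hypothetical names, and the marker-inside-cap rule is an assumption, not something the plan states.

```typescript
interface TruncationResult {
  value: string;
  originalLength: number; // in characters, for the audit record
  truncated: boolean;
}

// Truncate `value` so that kept prefix + marker is at most `cap` characters.
function truncateToCap(value: string, cap: number): TruncationResult {
  const chars = Array.from(value); // code points, not UTF-16 units
  const originalLength = chars.length;
  if (originalLength <= cap) {
    return { value, originalLength, truncated: false };
  }
  // The marker's length depends on the digit count of N, so shrink the kept
  // prefix one character at a time until prefix + marker fits.
  for (let keep = cap; keep >= 0; keep--) {
    const marker = `[…truncated ${originalLength - keep} chars]`;
    if (keep + Array.from(marker).length <= cap) {
      return {
        value: chars.slice(0, keep).join("") + marker,
        originalLength,
        truncated: true,
      };
    }
  }
  throw new Error(`cap ${cap} is too small to hold the truncation marker`);
}
```

The originalLength field is what the field_truncated_in_migration audit entry would record. If the marker were appended after cutting to exactly the cap, the row would exceed the cap and the CHECK constraint would reject it.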
Out of scope
- Compaction of normal-size historical rows.
- Different caps per event_type for audit (could be a v0.7+ refinement).
Coordination
- Pairs with #2919 (P1 hallucination) — both want v2→v3 schema migration; should ship as one migration.
- Pairs with #2918 (cache discipline) — capped fields are cache-friendlier (predictable size).
Note on source
Previous description claimed "1 MB allowed per content_json" without verifying. Source check (tools/audit.ts registration) shows no enforced cap in schema or middleware — only advisory in the tool description. Verified live max sizes are tiny. Reframed as preventative architecture, not remediation.