v0.7.0 — token-burn: Main session over-accumulation — summary tool messages for sub-tool results
Problem
CC's main loop accumulates every Bash result, Read excerpt, Edit, MCP response, and subagent final message verbatim into bro's context. Bro does not get a summary of a tool result — bro gets the full payload. The trajectory DB encourages "ask the DB for context" but the DB returns it as uncompressed text into the active window.
Largest empirical offenders (verified 2026-05-17):
file_registry_list— 925 summaries × 155 chars avg = 143 KB / ~36 K tokens for a full dump. Dominates other tool responses by 3–10×.issue_get_with_discussionson a hot issue — joins all discussion rows; can hit 20+ KB on issues with long histories*_report_md—issue_report_md+branch_report_mdmaterialize all sections; observed 8–15 KB in v0.6.0 sessionsaudit_log_list— smaller (67 rows × 142 char avg content_json = ~9.5 KB) but still uncapped
Combined with no auto-compaction within the main session, context grows linearly until CC's hard limit forces compaction (which itself burns tokens).
Evidence (verified against live DB)
Repeat measurements from 2026-05-17:
file_registry: 927 rows, 925 with summaries, total summary chars = 143,460audit: 67 rows, total content_json chars = ~9,500discussions: 25 rows, total body chars = ~24,375
Plan
Two complementary modes — pagination (#2905 (closed)) controls how many rows return; compact-mode controls how dense each row is.
compact: truedefault on all read-heavy tools. Compact mode returns:- For list tools:
{id, status, key_field}projection — not full rows - For join tools: counts + last-N + totals — not full joins
- For markdown reports: top-level metadata + last 5 events (~500 tokens)
- For list tools:
include_full: boolopt-in — explicit, never silentfields: string[]projection for JSON-array responses — bro often needs onlyid, status, objective, not the full row- Server-side markdown report modes:
summary | detail; defaultsummary - Document the "DB-as-context UI" tradeoff in
docs/contributing/ARCHITECTURE.mdso contributors understand the tax and design around it.
Acceptance criteria
- L2 tests confirm default
compact=truemode for all read-heavy tools returns ≤2 KB on the live populated DB. - New benchmark in
tests/benchmarks/tokens-per-bro-turn.shtracks average tokens-per-bro-turn over a representative session; baseline + post-fix recorded. - No bro skill code paths break (they receive compacted form by default; explicit
include_fullwhere the body is needed).
Out of scope
- LLM-based summarization (separate concern; not free).
- Streaming responses (MCP protocol concern).
Coordination
- Pairs with #2905 (closed) (pagination) — together they bound both row-count and row-density.
- Pairs with #2919 (P1 hallucination) — compact mode + enriched
file_registry(kind, line_count) lets bro answer "is X a real skill?" cheaply. - Pairs with #2906 (caps) — capped fields are deterministic for compact-mode shaping.
Note on source
Previous description estimated file_registry at "50–100 KB tokens." Verified empirically: 143 KB chars ≈ 36 K tokens. Updated.