v0.7.0 — token-burn: Constrain subagent fan-out per turn — batch cap + sequential mode for low-impact tasks

Problem

Per issue, TMB can spawn many subagents:

  • N SWE subagents (one per task in task_create_batch — no batch-size cap today; observed up to 9 in v0.6.0)
  • N pr-reviewer subagents (one per task push attempt; retries multiply)
  • 2–5 consultants per roundtable (parallel)
  • Phase-2 summary-fill subagents after every /scan (one per ~25-file batch)
  • Sub-subagents (Explore, general-purpose) spawned by bro or SWE mid-conversation

Each spawn is its own context that doesn't compress back into bro. SWE has maxTurns: 150 (built-in cost cap per spawn), but the count of spawns per turn/issue is unbounded.
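
The fan-out sources listed above can be added up in a rough back-of-the-envelope sketch. Everything here is an assumption made for illustration: the `IssueShape` fields and the `estimateSpawns` name are hypothetical, reviewer retries are treated as uniform across tasks, and sub-subagents (Explore, general-purpose) are left out because nothing in this issue bounds them. Only the "one per task", "one per push attempt", and "one per ~25-file batch" multipliers come from the list.

```typescript
// Hypothetical shape of one issue's workload; none of these names exist in TMB.
interface IssueShape {
  tasks: number;                   // tasks in task_create_batch (one SWE spawn each)
  reviewerRetriesPerTask: number;  // extra pr-reviewer attempts per task (assumed uniform)
  roundtables: number;             // roundtables held for the issue
  consultantsPerRoundtable: number; // 2 to 5 per the list above
  scans: number;                   // /scan invocations
  filesPerScan: number;            // files covered by each scan
}

// Approximate batch size for Phase-2 summary-fill, taken from "one per ~25-file batch".
const SUMMARY_FILL_BATCH_SIZE = 25;

// Lower-bound estimate of spawns per issue; excludes sub-subagents.
function estimateSpawns(shape: IssueShape): number {
  const swe = shape.tasks;
  // One reviewer per push attempt: the first attempt plus each retry.
  const reviewers = shape.tasks * (1 + shape.reviewerRetriesPerTask);
  const consultants = shape.roundtables * shape.consultantsPerRoundtable;
  const summaryFill =
    shape.scans * Math.ceil(shape.filesPerScan / SUMMARY_FILL_BATCH_SIZE);
  return swe + reviewers + consultants + summaryFill;
}
```

Because every term scales with a count that is currently uncapped, the total has no upper bound, which is the gap the plan below targets.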

Evidence (verified against live source)

  • agents/swe.md: maxTurns: 150 — single-spawn cap, but no cross-spawn cap.
  • agents/pr-reviewer.md: tools: ..., Task, ... — pr-reviewer CAN spawn its own subagents (extending the fan-out tree).
  • templates/agents/{architect,ceo,cto,pm}.md: each has model: opus — every consultant spawn pays opus pricing.
  • No server-side cap on task_create_batch batch size (verified in tools/tasks.ts).
  • PreToolUse [Agent] hooks fire on every subagent spawn (3 hooks: require-task-spec.sh, require-feature-branch-active.sh, pr-reviewer-no-worktree.sh), so hook overhead multiplies with the spawn count.

Plan

  1. max_parallel_swe plugin_config key (default 4); a task_create_batch larger than the cap is chunked into sequential waves of at most max_parallel_swe tasks.
  2. urgency: low | normal | high field on tasks; low tasks run sequentially regardless of cap (cosmetic/lint tasks).
  3. Default urgency: lint/cosmetic = low; normal feature/bug = normal; hotfix path = high.
  4. pr-reviewer retry cap: max 2 spawn-attempts before mandatory Human escalation. Track in pr_review_runs.retry_count.
  5. Interactive guard for large batches: ≥6 tasks in one task_create_batch triggers an AUQ-style confirmation in interactive mode ("Spawning N subagents. Confirm?").
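
Items 1, 2, and 5 of the plan could be sketched as a single pure planning function. This is a minimal illustration, not the implementation in tools/tasks.ts: `planWaves`, `Task`, `WavePlan`, and the constant names are hypothetical, and scheduling low-urgency tasks after the parallel waves is an assumption the plan does not state. Only the default cap of 4, the urgency values, and the threshold of 6 come from the plan.

```typescript
type Urgency = "low" | "normal" | "high";

// Hypothetical task shape; only the urgency field is named in the plan.
interface Task {
  id: string;
  urgency: Urgency;
}

interface WavePlan {
  waves: Task[][];            // tasks within a wave run in parallel; waves run in order
  needsConfirmation: boolean; // true when the interactive guard should prompt
}

const DEFAULT_MAX_PARALLEL_SWE = 4; // plan item 1
const CONFIRM_THRESHOLD = 6;        // plan item 5

function planWaves(
  tasks: Task[],
  maxParallelSwe: number = DEFAULT_MAX_PARALLEL_SWE,
  interactive: boolean = false,
): WavePlan {
  if (!Number.isInteger(maxParallelSwe) || maxParallelSwe < 1) {
    throw new RangeError("max_parallel_swe must be a positive integer");
  }
  const parallel = tasks.filter((t) => t.urgency !== "low");
  const sequential = tasks.filter((t) => t.urgency === "low");

  const waves: Task[][] = [];
  // Chunk normal and high tasks into waves no larger than the cap.
  for (let i = 0; i < parallel.length; i += maxParallelSwe) {
    waves.push(parallel.slice(i, i + maxParallelSwe));
  }
  // Low-urgency tasks run one at a time regardless of the cap
  // (placing them last is an assumption of this sketch).
  for (const task of sequential) {
    waves.push([task]);
  }
  return {
    waves,
    needsConfirmation: interactive && tasks.length >= CONFIRM_THRESHOLD,
  };
}
```

With the default cap, a batch of 12 normal-urgency tasks yields three waves of four, which matches the first acceptance criterion. Keeping the planner free of side effects would let the L4 simulation assert on the wave layout without spawning anything.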

Acceptance criteria

  • L4 simulation: task_create_batch of 12 tasks (default config) runs in 3 sequential waves of 4.
  • L4 simulation: 3rd pr-reviewer retry refuses and escalates.
  • L5 dogfood verifies no regression in throughput for normal-size batches.
  • Schema v3 includes pr_review_runs.retry_count.
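
The retry criterion could be exercised with a sketch like the one below. It assumes the cap counts retries (so the first two retries spawn and the third escalates, as the L4 criterion describes); whether the initial review spawn also counts toward the cap is not settled in this issue. `PrReviewRun`, `ReviewDecision`, and `requestReviewerRetry` are hypothetical names; only `retry_count` and the cap of 2 come from the plan.

```typescript
// Hypothetical in-memory mirror of a pr_review_runs row.
interface PrReviewRun {
  taskId: string;
  retryCount: number; // mirrors pr_review_runs.retry_count
}

type ReviewDecision =
  | { action: "spawn"; run: PrReviewRun }
  | { action: "escalate"; reason: string };

const MAX_REVIEWER_RETRIES = 2; // plan item 4

// Decide whether another pr-reviewer retry may spawn or must go to a human.
function requestReviewerRetry(run: PrReviewRun): ReviewDecision {
  if (run.retryCount >= MAX_REVIEWER_RETRIES) {
    return {
      action: "escalate",
      reason: `retry cap of ${MAX_REVIEWER_RETRIES} reached for task ${run.taskId}`,
    };
  }
  // Return an updated copy rather than mutating the caller's record.
  return { action: "spawn", run: { ...run, retryCount: run.retryCount + 1 } };
}
```

Returning a decision value instead of spawning directly keeps the cap testable in isolation; the caller would persist the incremented count before spawning.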

Out of scope

  • Subagent result caching across retries.
  • Reducing parallel consultants in roundtables (covered in #2910).

Coordination

  • Pairs with #2910 (roundtable cap) — both bound parallel fan-out.
  • Pairs with #2906 (schema v3) — retry_count ships in that migration.

Note on source

Verified swe.md maxTurns=150 and pr-reviewer.md Task tool inclusion. Verified consultant model=opus. Verified PreToolUse [Agent] hook count (3). Previous description's spawn count estimates (8–12 per issue) are accurate.

Edited by Zax Shen