feat(search): FTS5 search infrastructure + 3 MCP search tools (#2905 Phase 1)

Phase 1 of GL !2905 (v0.7.0 search-first architecture).

What this MR ships

Pure FTS5, zero new dependencies. Phase 2 (separate MR, stacked on this branch) adds the ONNX embedding layer + hybrid ranking.

Schema v3 migration

  • TARGET_SCHEMA_VERSION 2→3
  • migrateV2toV3 adds 3 FTS5 virtual tables: discussions_fts, audit_fts, file_registry_fts
  • 9 sync triggers (per table × INSERT/UPDATE/DELETE) keep FTS in lock-step with source
  • file_registry triggers guard on NULL summary (prevents FTS corruption)
  • Backfill runs in-transaction; safe for existing DBs
  • schema.sql updated for fresh-DB path
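
For reviewers unfamiliar with FTS5 sync triggers, here is a minimal sketch of how the 9 triggers could be generated. It is not the code in migrateV2toV3. The source table names, column names, trigger names, and the external-content 'delete' command form are all assumptions made for illustration; only the three *_fts table names come from this MR.

```typescript
// Hypothetical spec for one FTS5 table. Only the *_fts names come from the MR;
// source tables, columns, and the guard column are assumed for illustration.
interface FtsSpec {
  source: string;    // source table the triggers attach to
  fts: string;       // FTS5 virtual table kept in sync
  columns: string[]; // text columns mirrored into the index
  guard?: string;    // column that must be non-NULL for a row to be indexed
}

// Builds INSERT/UPDATE/DELETE triggers (3 per table) as DDL strings.
// Assumes an external-content FTS5 table, where removals use the 'delete' command.
function buildSyncTriggers(spec: FtsSpec): string[] {
  const { source, fts, columns, guard } = spec;
  const cols = columns.join(", ");
  const vals = (row: "new" | "old"): string =>
    columns.map((c) => `${row}.${c}`).join(", ");
  // INSERT ... SELECT ... WHERE lets one trigger body skip rows with a NULL guard column.
  const where = (row: "new" | "old"): string =>
    guard ? ` WHERE ${row}.${guard} IS NOT NULL` : "";

  const add = `INSERT INTO ${fts}(rowid, ${cols}) SELECT new.rowid, ${vals("new")}${where("new")};`;
  const remove = `INSERT INTO ${fts}(${fts}, rowid, ${cols}) SELECT 'delete', old.rowid, ${vals("old")}${where("old")};`;

  return [
    `CREATE TRIGGER ${fts}_ai AFTER INSERT ON ${source} BEGIN ${add} END;`,
    `CREATE TRIGGER ${fts}_ad AFTER DELETE ON ${source} BEGIN ${remove} END;`,
    `CREATE TRIGGER ${fts}_au AFTER UPDATE ON ${source} BEGIN ${remove} ${add} END;`,
  ];
}

// Assumed source tables and columns; adjust to the real schema.
const specs: FtsSpec[] = [
  { source: "discussions", fts: "discussions_fts", columns: ["body"] },
  { source: "audit_log", fts: "audit_fts", columns: ["details"] },
  { source: "file_registry", fts: "file_registry_fts", columns: ["summary"], guard: "summary" },
];

const allTriggers: string[] = [];
for (const spec of specs) {
  allTriggers.push(...buildSyncTriggers(spec));
}
```

In this sketch the guard matters because issuing the 'delete' command for a row that was never indexed (for example, one whose summary was NULL) can leave an external-content index inconsistent.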

3 new MCP tools

Tool                                             | Filters               | Default response
discussion_search(query, k=5, recency_alpha=0.3) | issue_id?, kind?      | Top-K snippets, BM25 + recency-decay
audit_search(query, k=5, ...)                    | event_types?          | Same shape
file_registry_search(query, k=10)                | path_prefix? shortcut | Same shape

Snippet length is capped at 16 tokens, with [ and ] as highlight delimiters. Default response stays ≤1 KB.
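
To make "BM25 + recency-decay" concrete, here is one way the blend could work. This is a sketch under stated assumptions, not the formula used in this MR: the linear mix, the min-max normalization, the half-life decay, and the Hit shape are all hypothetical. Only k=5 and recency_alpha=0.3 come from the tool signature.

```typescript
// Hypothetical hit shape: bm25 as returned by FTS5's bm25() (more negative
// means a better match) and the row's age in days.
interface Hit {
  id: number;
  bm25: number;
  ageDays: number;
}

// Assumed blend: (1 - alpha) * normalized relevance + alpha * recency,
// where recency halves every halfLifeDays (halfLifeDays is an invented knob).
function rankWithRecency(
  hits: Hit[],
  k: number = 5,
  recencyAlpha: number = 0.3,
  halfLifeDays: number = 30,
): Hit[] {
  if (hits.length === 0) return [];
  // Negate bm25 so that larger means more relevant.
  const relevances = hits.map((h) => -h.bm25);
  const min = Math.min(...relevances);
  const span = Math.max(...relevances) - min;

  const scored = hits.map((h) => {
    // Min-max normalize relevance into [0, 1]; a single-valued set maps to 1.
    const relevance = span === 0 ? 1 : (-h.bm25 - min) / span;
    const recency = Math.pow(0.5, h.ageDays / halfLifeDays);
    return { hit: h, score: (1 - recencyAlpha) * relevance + recencyAlpha * recency };
  });

  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, k).map((s) => s.hit);
}
```

With recencyAlpha set to 0 this reduces to plain BM25 ordering, so the parameter acts as a dial between pure text relevance and freshness.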

Cursor pagination on 6 existing tools

issue_get_with_discussions, discussion_list, audit_log_list, file_registry_list, validation_history, pr_review_runs_list gain optional limit + cursor. Default behavior unchanged when limit is unset — back-compat preserved.
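
The back-compat contract can be sketched as follows. This is an illustration only: the Page shape, the paginate helper, and the cursor encoding (the stringified id of the last returned row) are assumptions, not the wire format these tools use.

```typescript
// Hypothetical response shape: nextCursor is present only when more rows remain.
interface Page<T> {
  items: T[];
  nextCursor?: string;
}

// Keyset-style pagination over rows assumed to be sorted by ascending id.
// When limit is undefined, every remaining row is returned and no cursor is emitted.
function paginate<T extends { id: number }>(
  rows: T[],
  limit?: number,
  cursor?: string,
): Page<T> {
  const afterId = cursor === undefined ? -Infinity : Number(cursor);
  if (Number.isNaN(afterId)) {
    throw new Error("invalid cursor");
  }
  const remaining = rows.filter((r) => r.id > afterId);
  if (limit === undefined) {
    return { items: remaining };
  }
  const items = remaining.slice(0, limit);
  const last = items[items.length - 1];
  const hasMore = remaining.length > items.length;
  return hasMore && last !== undefined
    ? { items, nextCursor: String(last.id) }
    : { items };
}
```

A caller that never passes limit sees the same full list as before, while a paginating caller loops until nextCursor is absent.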

Tests

  • 16 new L2 assertions in src/test/search.test.ts (all 3 tools + triggers + pagination)
  • 6 new L2 assertions in src/test/schema-upgrade.test.ts (v2→v3 migration + backfill + NULL-summary guard)
  • schema.test.ts + db.test.ts updated for schema_version=3 + 22 tables
  • bash tests/run-all.sh exit 0 — L1+L2+L3+L4 all green

pr-reviewer verdict: PASS (validation_attempts id=25)

3 non-blocking nits flagged for follow-up, all cosmetic: recency_alpha is unused in file_registry_search, the spec cites the wrong src/index.ts path, and the default-limit doc strings are now slightly misleading at the 200/500 caps.

Held for manual review per session pattern — no auto-merge.

Phase 2 preview

feat/search-rag-onnx-embeddings will stack on this branch and add:

  • @huggingface/transformers + onnxruntime-node + bge-small-en-v1.5 (33 MB, lazy-downloaded)
  • Per-content *_embeddings tables + sync triggers
  • Brute-force JS cosine search over Float32 BLOBs
  • RRF hybrid ranking (FTS5 + semantic + recency)
  • Skill body updates (tmb_planning / tmb_review / tmb_recovery) to use *_search by default
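
As a rough preview of the brute-force cosine search and RRF items above, here is a sketch. Phase 2 is not written yet, so everything here is an assumption: the row shape, the helper names, and the RRF constant c=60 (a commonly used default) are illustrative only.

```typescript
// Decode a BLOB of raw bytes into Float32 values. Copying first avoids
// alignment errors when the bytes sit at an odd offset in a larger buffer.
function blobToVector(blob: Uint8Array): Float32Array {
  return new Float32Array(blob.slice().buffer);
}

// Cosine similarity; returns 0 when either vector has zero norm.
function cosine(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Brute force: score every stored embedding, return ids from best to worst.
function semanticRank(
  query: Float32Array,
  rows: { id: number; embedding: Uint8Array }[],
): number[] {
  return rows
    .map((r) => ({ id: r.id, sim: cosine(query, blobToVector(r.embedding)) }))
    .sort((a, b) => b.sim - a.sim)
    .map((r) => r.id);
}

// Reciprocal Rank Fusion: each ranking contributes 1 / (c + rank), rank starting at 1.
function rrf(rankings: number[][], c: number = 60): number[] {
  const scores = new Map<number, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (c + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Under these assumptions, the FTS5, semantic, and recency orderings would each be passed to rrf as one id list, which sidesteps having to put BM25 scores and cosine similarities on a common scale.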
