feat(formatters): implement GoonFormatter encoder

Summary

Replaces the 24-line GoonFormatter stub with a real encoder. Implements the format=llm path of the Orbit API ADR. 55 tests, all passing.

Backed by ~2,000 task-runs of agent eval against real GitLab.com on Haiku 4.5 (gitlab-org/orbit/gkg-evals-harness!1 MR note 3331102607). The kv shape was the only variant strictly Pareto-dominant over raw JSON: -11% cost, -15% duration, +4.8pp correctness, p=0.043 on tool_sequence_length. The @hints block was dropped because the min variant matched kv on accuracy at lower token cost.

Closes #271. Relates to !904.

[skip response-schema-version-check]

The lib.rs and graph.rs edits in this MR are wire-neutral (added Clone derive on GraphNode/GraphEdge, added GOON re-exports). They do not affect the RAW response shape, so no RAW_OUTPUT_FORMAT_VERSION bump is required. The new goon-format-version-check (added in commit e013fe26c) does fire and confirms config/GOON_OUTPUT_FORMAT_VERSION is set at 1.0.0 for the initial release.

Format shape

@header
query_type:traversal
format_version:1.0.0
nodes:13
edges:16

@nodes
MergeRequest(8):
482927048 iid=18 state=merged title="chore: move skill to project scope"
...

@edges
AUTHORED(8):
User:5252563 --> MergeRequest:482927048
...

For path_finding: edges become @paths chains. For aggregation: column functions surface in @header; grouped values appear inline on each node, ungrouped values appear in a values: line.
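
The @nodes section shown above can be sketched as a group-by over entity type. This is a minimal illustration, not the encoder's actual code: the Node struct, its props field, and render_nodes are hypothetical stand-ins, and property values are assumed to be formatted already. It covers the traversal and path_finding case only, since aggregation rows keep server order instead.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified node; the crate's real GraphNode is richer.
pub struct Node {
    pub entity_type: String,
    pub id: i64,
    /// Already-formatted key/value pairs, in emission order (assumption).
    pub props: Vec<(String, String)>,
}

/// Render the @nodes section: one `Type(count):` group per entity type,
/// groups ordered by type name and rows ordered by id.
pub fn render_nodes(nodes: &[Node]) -> String {
    // BTreeMap iterates keys in sorted order, which fixes the group order.
    let mut groups: BTreeMap<&str, Vec<&Node>> = BTreeMap::new();
    for node in nodes {
        groups.entry(node.entity_type.as_str()).or_default().push(node);
    }

    let mut out = String::from("@nodes\n");
    for (entity_type, mut members) in groups {
        members.sort_by_key(|n| n.id);
        out.push_str(&format!("{}({}):\n", entity_type, members.len()));
        for node in members {
            out.push_str(&node.id.to_string());
            for (key, value) in &node.props {
                out.push_str(&format!(" {}={}", key, value));
            }
            out.push('\n');
        }
    }
    out
}
```

Feeding it the three nodes from the path-finding example in any order yields the MergeRequest, Project, User groups in that order.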

Test plan

55 tests in the formatters crate, all passing under cargo test --all-targets; cargo clippy --all-features --all-targets is clean.

| Layer | File | Count | What it covers |
|---|---|---|---|
| Unit | src/goon/tests.rs | 30 | Header structure, sections, quoting/escaping, datetime normalization, truncation, numerics, edges, dedup, path-finding, aggregation |
| Snapshot (insta) | tests/goon_snapshots.rs | 6 | One golden file per query type plus pagination and ungrouped aggregation. cargo insta review flags drift on every commit. |
| Property (proptest) | tests/goon_properties.rs | 4 × 64 cases | Shuffle invariance, idempotence, control-char invariant, header prefix |
| Trait wiring | src/goon/mod.rs | 2 | format_name returns Goon, format_version returns Some(&GOON_OUTPUT_FORMAT_VERSION) |
| Pre-existing | src/graph.rs | 13 | GraphFormatter::build_response (the encoder's input) |

Determinism guarantees locked by tests
  • shuffle_invariant: 64 random payloads × random shuffles must encode to byte-identical output. Catches non-deterministic sort tie-breakers.
  • encoding_is_pure: same input encodes to same bytes across calls. Catches hidden state.
  • output_starts_with_header: @header\n is always first. Catches accidental whitespace/BOM prefixes.
  • no_unescaped_control_chars: no line contains a raw \r or \t. Catches control-char leaks that would silently merge tokens (the bug that bit the Python prototype).
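
The invariants above reduce to small predicates over the encoded string. The sketch below uses only the standard library rather than proptest, and every function name in it is hypothetical; the real property tests live in tests/goon_properties.rs and generate random payloads, which this does not attempt.

```rust
/// True when the encoded output begins with the `@header` line,
/// so no whitespace or BOM can precede it.
pub fn starts_with_header(output: &str) -> bool {
    output.starts_with("@header\n")
}

/// True when the output contains no raw carriage return or tab.
/// Scans characters rather than `str::lines`, because `lines` strips a
/// trailing `\r` and would hide exactly the leak being checked for.
pub fn has_no_raw_control_chars(output: &str) -> bool {
    !output.chars().any(|c| c == '\r' || c == '\t')
}

/// A cheap stand-in for shuffle invariance: the reversed and rotated
/// inputs must encode to the same bytes as the original order.
pub fn is_order_invariant<T: Clone>(encode: impl Fn(&[T]) -> String, items: &[T]) -> bool {
    let mut reversed = items.to_vec();
    reversed.reverse();
    let mut rotated = items.to_vec();
    if !rotated.is_empty() {
        rotated.rotate_left(1);
    }
    let expected = encode(items);
    encode(&reversed) == expected && encode(&rotated) == expected
}
```

An encoder that sorts before emitting passes is_order_invariant; one that emits in input order fails it for any input with at least two distinct items.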

The total ordering is: nodes sorted by (entity_type, id); edges sorted by (path_id, step, edge_type, from, from_id, to, to_id, depth). Aggregation rows preserve server order so a count-DESC ranking on the wire stays in count-DESC order.
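
A total ordering like this maps naturally onto tuple comparison in Rust, which compares element by element. The following is a sketch under assumptions: Node and Edge are hypothetical simplified structs, and the field types (in particular the integer widths and whether path_id is optional in the real GraphEdge) are guesses made for illustration.

```rust
/// Hypothetical, simplified stand-in for the crate's GraphNode.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Node {
    pub entity_type: String,
    pub id: i64,
}

/// Hypothetical, simplified stand-in for the crate's GraphEdge.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Edge {
    pub path_id: u32,
    pub step: u32,
    pub edge_type: String,
    pub from: String,
    pub from_id: i64,
    pub to: String,
    pub to_id: i64,
    pub depth: u32,
}

/// Sort nodes by (entity_type, id).
pub fn sort_nodes(nodes: &mut [Node]) {
    nodes.sort_by(|a, b| {
        a.entity_type
            .cmp(&b.entity_type)
            .then_with(|| a.id.cmp(&b.id))
    });
}

/// Borrowed sort key in the documented field order.
fn edge_key(e: &Edge) -> (u32, u32, &str, &str, i64, &str, i64, u32) {
    (
        e.path_id,
        e.step,
        e.edge_type.as_str(),
        e.from.as_str(),
        e.from_id,
        e.to.as_str(),
        e.to_id,
        e.depth,
    )
}

/// Sort edges by (path_id, step, edge_type, from, from_id, to, to_id, depth).
pub fn sort_edges(edges: &mut [Edge]) {
    edges.sort_by(|a, b| edge_key(a).cmp(&edge_key(b)));
}
```

Because the key covers every field of these simplified structs, two elements compare equal only when they are identical, so the sort leaves no tie for input order to decide.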

Value formatting rules
  • Value::Null and empty strings are omitted from node rows.
  • Booleans render as true / false.
  • Non-finite floats (NaN, ±Inf) are dropped.
  • ClickHouse 2026-05-08 22:55:58.467450 is normalized to 2026-05-08T22:55:58.467450 so the value is bare-emittable without breaking the space-delimited row format.
  • Strings of [A-Za-z0-9_\-:./@+] characters or ISO datetimes pass bare. Anything else is quoted with "..." and \\ / \" / \n / \r / \t escapes.
  • Long-text columns (title, description, body, name, note) truncate at 200 chars with a sibling <key>_len=N breadcrumb.
  • Any other string is hard-capped at 1,000 chars with the same breadcrumb.
  • Integer node IDs render bare regardless of size (handles values past 2^53).
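
The string rules above can be sketched as follows. This is an illustration rather than the code in encode.rs: all function and constant names are hypothetical, lengths are counted in Unicode scalar values, and <key>_len=N is assumed to report the original, untruncated length.

```rust
/// Columns treated as long text, per the rules above.
const LONG_TEXT_KEYS: [&str; 5] = ["title", "description", "body", "name", "note"];
const LONG_TEXT_LIMIT: usize = 200;
const DEFAULT_LIMIT: usize = 1000;

/// Rewrite `YYYY-MM-DD HH:MM:SS[.fraction]` to use a `T` separator.
/// Returns None when the input does not have that shape.
pub fn normalize_datetime(s: &str) -> Option<String> {
    let b = s.as_bytes();
    if b.len() < 19 {
        return None;
    }
    let digits_at = [0, 1, 2, 3, 5, 6, 8, 9, 11, 12, 14, 15, 17, 18];
    let shape_ok = b[4] == b'-'
        && b[7] == b'-'
        && b[10] == b' '
        && b[13] == b':'
        && b[16] == b':'
        && digits_at.iter().all(|&i| b[i].is_ascii_digit());
    let fraction_ok = b.len() == 19
        || (b.len() > 20 && b[19] == b'.' && b[20..].iter().all(|c| c.is_ascii_digit()));
    if shape_ok && fraction_ok {
        // The first 19 bytes are ASCII here, so byte slicing is safe.
        Some(format!("{}T{}", &s[..10], &s[11..]))
    } else {
        None
    }
}

/// True when every character is in the bare-safe set [A-Za-z0-9_\-:./@+].
pub fn is_bare(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_alphanumeric() || "_-:./@+".contains(c))
}

/// Wrap in double quotes, escaping backslash, quote, \n, \r and \t.
pub fn quote(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            other => out.push(other),
        }
    }
    out.push('"');
    out
}

/// Render one `key=value` pair for a string value; None means "omit".
pub fn render_string(key: &str, value: &str) -> Option<String> {
    if value.is_empty() {
        return None;
    }
    let value = normalize_datetime(value).unwrap_or_else(|| value.to_string());
    let limit = if LONG_TEXT_KEYS.contains(&key) { LONG_TEXT_LIMIT } else { DEFAULT_LIMIT };
    let total = value.chars().count();
    let kept: String = value.chars().take(limit).collect();
    let rendered = if is_bare(&kept) { kept } else { quote(&kept) };
    if total > limit {
        Some(format!("{}={} {}_len={}", key, rendered, key, total))
    } else {
        Some(format!("{}={}", key, rendered))
    }
}
```

With these definitions, render_string("title", "chore: move skill to project scope") produces the quoted form seen in the format-shape example, because the spaces fall outside the bare-safe set.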

Path-finding example

@header
query_type:path_finding
format_version:1.0.0
nodes:3
edges:2
@nodes
MergeRequest(1):
482927048 iid=18 state=merged
Project(1):
278964 name=GitLab
User(1):
64248 username=stanhu
@paths
path=0: User:64248 --AUTHORED--> MergeRequest:482927048 --IN_PROJECT--> Project:278964
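
A @paths line like the one above can be produced by chaining hops. This is a minimal sketch, assuming each hop's target is the next hop's source; the Hop struct and render_path are hypothetical names, not the encoder's types.

```rust
/// Hypothetical, simplified hop along a path.
pub struct Hop {
    pub from_type: String,
    pub from_id: i64,
    pub edge_type: String,
    pub to_type: String,
    pub to_id: i64,
}

/// Render one path as `path=N: A:1 --REL--> B:2 --REL--> C:3`.
/// Assumes hops are already in step order and are contiguous.
pub fn render_path(path_id: u32, hops: &[Hop]) -> String {
    let mut out = format!("path={}:", path_id);
    for (i, hop) in hops.iter().enumerate() {
        if i == 0 {
            // Only the first hop contributes its source node.
            out.push_str(&format!(" {}:{}", hop.from_type, hop.from_id));
        }
        out.push_str(&format!(
            " --{}--> {}:{}",
            hop.edge_type, hop.to_type, hop.to_id
        ));
    }
    out
}
```

Called with the AUTHORED and IN_PROJECT hops from the example, it reproduces the path=0 line byte for byte.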

Aggregation examples

Grouped (per-user merged_count, count desc, server order preserved):

@header
query_type:aggregation
format_version:1.0.0
nodes:3
edges:0
aggregations:merged_count(count)
@nodes
User(3):
1243277 username=ghost1 merged_count=65555
35702613 username=bot_a merged_count=21277
26832240 username=bot_b merged_count=20289

Ungrouped (single scalar):

@header
query_type:aggregation
format_version:1.0.0
nodes:0
edges:0
aggregations:total(count)
values:total=2347
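
The @header block in both aggregation examples follows one pattern, sketched below. The Header struct and render_header are hypothetical. The separators used when there is more than one aggregation column (a comma) or more than one ungrouped value (a space) are assumptions, since the examples only show a single entry of each.

```rust
/// Hypothetical header description; not the encoder's real type.
pub struct Header<'a> {
    pub query_type: &'a str,
    pub format_version: &'a str,
    pub nodes: usize,
    pub edges: usize,
    /// (alias, function) pairs, e.g. ("merged_count", "count").
    pub aggregations: Vec<(&'a str, &'a str)>,
    /// Ungrouped scalar results, e.g. ("total", "2347").
    pub values: Vec<(&'a str, &'a str)>,
}

/// Render the @header block; `aggregations:` and `values:` lines are
/// emitted only when there is something to report.
pub fn render_header(h: &Header<'_>) -> String {
    let mut out = String::from("@header\n");
    out.push_str(&format!("query_type:{}\n", h.query_type));
    out.push_str(&format!("format_version:{}\n", h.format_version));
    out.push_str(&format!("nodes:{}\n", h.nodes));
    out.push_str(&format!("edges:{}\n", h.edges));
    if !h.aggregations.is_empty() {
        let cols: Vec<String> = h
            .aggregations
            .iter()
            .map(|(alias, func)| format!("{}({})", alias, func))
            .collect();
        // Comma separator is an assumption for the multi-column case.
        out.push_str(&format!("aggregations:{}\n", cols.join(",")));
    }
    if !h.values.is_empty() {
        let vals: Vec<String> = h
            .values
            .iter()
            .map(|(key, value)| format!("{}={}", key, value))
            .collect();
        // Space separator is an assumption for the multi-value case.
        out.push_str(&format!("values:{}\n", vals.join(" ")));
    }
    out
}
```

For the ungrouped case, a header with query_type aggregation, zero nodes and edges, one ("total", "count") aggregation and one ("total", "2347") value renders the seven lines shown in the example above.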

Code organization

crates/query-engine/formatters/src/goon/
├── mod.rs        GoonFormatter trait impl + GOON_OUTPUT_FORMAT_VERSION LazyLock
├── encode.rs     encode(&GraphResponse, &Version) -> String
├── fixtures.rs   test-only fixture builders (cfg(test))
└── tests.rs      30 unit tests (cfg(test))

crates/query-engine/formatters/tests/
├── goon_snapshots.rs    6 insta golden tests
├── goon_properties.rs   4 proptest invariants
└── snapshots/           6 .snap files (committed)

config/GOON_OUTPUT_FORMAT_VERSION   1.0.0

The encoder works directly on the typed GraphResponse and does no JSON parsing. GoonFormatter::format is a single composition: Value::String(encode(GraphFormatter.build_response(output), &GOON_OUTPUT_FORMAT_VERSION)). Clone was derived on GraphNode and GraphEdge so test fixtures can be reused; this is a derive-only change with no wire impact.

Version-bump methodology

config/GOON_OUTPUT_FORMAT_VERSION follows the same versioning discipline as RAW_OUTPUT_FORMAT_VERSION. New files added in this MR:

  • scripts/check-goon-format-version.sh mirrors check-response-schema-version.sh. Watches crates/query-engine/formatters/src/goon/**.rs, graph.rs, lib.rs. Demands config/GOON_OUTPUT_FORMAT_VERSION be bumped on any change.
  • lefthook.yml runs the new check pre-commit alongside the RAW check.
  • .gitlab-ci.yml adds the goon-format-version-check job in the lint stage, MR-only.
  • AGENTS.md / CLAUDE.md "What CI enforces" list now mentions both checks.

Skip the check on wire-neutral edits with [skip goon-format-version-check] in the MR description or SKIP_GOON_FORMAT_VERSION_CHECK=1 locally, identical to the RAW pattern.

Verification commands

mise run trust && mise run test:fast
cargo test -p formatters --all-targets
cargo clippy -p formatters --all-features --all-targets -- -D warnings
cargo fmt --check
cargo insta review   # accept any snapshot drift on intentional format changes
bash scripts/check-goon-format-version.sh origin/main
bash scripts/check-response-schema-version.sh origin/main

Limitations and follow-ups

  • The encoder is validated on a single model (Haiku 4.5). KG-LLM-Bench shows up to 17.5pp swing per format/model pairing. Cross-model validation (Sonnet 4.6, Opus 4.7, GPT-5, Gemini 2.5) is an open follow-up tracked alongside #271.
  • format_version is 1.0.0; bump config/GOON_OUTPUT_FORMAT_VERSION when the wire shape changes, following the same semver discipline as the existing response-schema-version-check hook. The new goon-format-version-check enforces this.
  • The Python prototype at gkg-evals-harness:vendor/skills/orbit-goon-kv/scripts/goon_encode.py served as the spec. Both encoders should produce byte-identical output on identical input. Cross-language parity tests are deferred to a follow-up MR; the property + snapshot coverage already locks the Rust side.

/cc @michaelangeloio

Edited by Michael Angelo Rivera
