output-level-invariants (#39) · Issues · Jani Päijänen / agent-governance

output-level-invariants

## Learning Record **Project codename:** blue-marlin **Date:** 2026-04-26 **Category:** gap **Severity:** significant **Framework version:** 0.34.0 ### What happened A project that produces visual / structural output (parametric CAD model of a wooden pier, generating STEP / GLB / DXF / CSV) ran successfully through five governance pipeline iterations with all 11 pytest invariants and the advisory pattern (timber + structural advisor) passing. The user (Governor-Reviewer) then opened the DXF / 3D rendering and immediately observed an X-cross brace appearing above the deck — a clear geometric error. Investigation found two bugs: 1. The rotation expression `angle_deg * direction - 90` for the cross brace flipped the box from horizontal to vertical orientation, so a 1816 mm long brace ended up standing vertically with its top 576 mm above the deck. 2. The `lower_waling_z` formula (`water_level + 400`) placed the "lower" waling 46 mm ABOVE the "upper" waling at default parameters, collapsing the X-bracing vertical span to almost zero. Both bugs pre-existed in the source code adopted by the project. Five iterations of work — running the script, regenerating outputs, re-validating — never surfaced them, because the pytest suite exercised only Mitat-class arithmetic (parameter relationships) and never inspected the produced geometric output. The fix iteration added two new invariants — `test_waling_levels_dont_overlap` and `test_cross_brace_fits_below_deck` — that operate on the same Mitat data but assert OUTPUT-LEVEL relations. Both would have caught the respective bug at iteration 1. The pattern generalises: when a project produces visible artefacts (drawings, renderings, files, UI screenshots, charts), the pytest suite should include "user acceptance" invariants — geometric, visual, or structural relations expressible as data assertions — not only unit-level mathematical correctness. Otherwise the human reviewer becomes the only safety net, and the defect surfaces only when they look. ### Agents involved coder-tester, structural-advisor (advisory, informal), governor-reviewer ### Framework section `framework.md` Pattern: Test-first `rules-template/workflow/tdd-rule.md` ### Business impact `prevented_loss` — caught two pre-existing geometric bugs (one critical: cross brace projecting through deck) that would have led to incorrect construction or, at minimum, confusing drawings handed to a builder. ### Recommendation Extend the tdd-rule (or add a new file-generator rule) to require: > When a project produces visual or structural output, write at least > one pytest invariant per OUTPUT-LEVEL relation, not just per input > parameter. Examples of output-level invariants: > - "X is below Y" (vertical ordering of components) > - "component count > 0" (presence checks) > - "no overlap between A and B" (geometric exclusion) > - "endpoint of element X equals endpoint of element Y" (connection > continuity) > > These are cheap to write (operate on the same data the unit tests > use) and cover the gap between unit math and human inspection. A concrete example of two such invariants from the blue-marlin project, applied retrospectively after the bug discovery: ```python def test_waling_levels_dont_overlap(m: Mitat): """Lower waling top must be below upper waling bottom.""" lower_top = m.lower_waling_z + m.waling_w upper_bottom = m.upper_waling_z assert lower_top < upper_bottom def test_cross_brace_fits_below_deck(m: Mitat): """Cross brace attachment points must be below deck bottom.""" deck_bottom = m.deck_top - m.deck_board_t assert m.cross_attach_upper_z < deck_bottom assert m.cross_attach_lower_z < deck_bottom ``` --- **Anonymization checklist:** - [x] No real project, product, or organisation names - [x] No internal URLs, hostnames, or infrastructure identifiers - [x] No customer or partner names

issue