Tags

Tags give the ability to mark specific points in history as being important
  • v0.5.438-rete-plus-develop

    protected
    v0.5.438-rete-plus-develop: cherry-pick 3 high-value develop fixes
    
    Surgical pick of 3 commits from develop, validated locally:
    
    - 473cc0915 pre-transpose technosphere matrix to optimize MatrixCalculationGFM
      (Simon Greuter, 0592f3a34 on develop; 1 file, +2/-1, 7/7 unit tests pass)
    
    - 4f6315640 fix missing massive nutrients that should be set to 0 in
      attach food tag gfm
      (Simon Greuter, 1efef2700 on develop; 1 file, +19/-9, 7/7 unit tests pass)
    
    - f5576e7d8 Fix handling of DeletedLinkToUidProp in AddClientNodesGFM
      (Simon Greuter, 2476d9e9a on develop; 2 files, +17/-17, 36/36 unit tests pass)
    
    The 4th candidate (5568f41b3 "Skip invalidation for known non
    invalidating transient upserts" by Yannick Schubert) was NOT picked:
    it deletes lan_upsert_diff_classifier.py and replaces it with two new
    classifier files. On HEAD that file had already been refactored away,
    so the pick produces a modify/delete conflict that cannot be resolved
    surgically. The commit needs a proper integration design and is
    deferred.
    
    Replica suite (5 dagster combined recipes) passes with identical
    error counts to v0.5.437 — no regression.
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  • v0.5.437-rete-plus-develop

    protected
    v0.5.437-rete-plus-develop: fix missing_greenhouse_data regression on top-level FPFs (#192)
    
    Builds on v0.5.436 with one correctness fix:
    
    - ebb956e09 fix(greenhouse): skip emission on top-level FPFs without resolved location
    
    Closes the missing_greenhouse_data regression that v0.5.436 cluster
    validation (run f015ca59) showed to be independent of #191: dev
    counts were 0/2/3/3/1/3/3/8/6/6 on combined_8..40, prod all zero.
    
    Same RETE gate divergence as #191: ancestor_flow_location_resolved
    allow_none=True lets origin-less top-level FPFs through where
    develop's PropCheck(stop_at_settled=True) waits indefinitely. Fix
    extends the existing greenhouse run-time skip-if-no-inheritable-
    country guard to also cover AddClientNodesGapFillingWorker.
    
    Local: local_combined_recipe_3_ings missing_greenhouse_data 1 -> 0.
    All 5 replica tests pass. 23/23 unit tests pass.
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  • v0.5.436-rete-plus-develop

    protected
    v0.5.436-rete-plus-develop: fix unknown_origin_location regression on top-level FPFs (#191)
    
    Builds on v0.5.435 with one correctness fix:
    
    - 297dc0b69 fix(rainforest): skip emission on top-level FPFs without resolved location
    
    Closes the unknown_origin_location regression that scaled with recipe
    size (cluster: dev 5/8/18/42 vs prod 1/0/2/2 on 4/8/16/40_ings).
    
    Root cause: rainforest's ancestor_flow_location_resolved gate uses
    allow_none=True, which lets top-level recipe FPFs through even when
    LocationGFM never fires on them (LocationGFM only runs when
    flow_location is a raw str; None never qualifies, so a FPF with no
    user-specified origin stays at None forever).
    
    Fix extends the existing Origin/WaterScarcity-duplicate skip-if-no-
    inheritable-location guard to also cover AddClientNodesGapFillingWorker.
    
    Local repro: combined_4 unknown_origin_location 3 -> 1 (matches develop).
    All RETE/GFM unit suites pass (20/20).
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  • v0.5.435-rete-plus-develop

    protected
    v0.5.435-rete-plus-develop: β-A skip counters + measurement
    
    Builds on v0.5.434 with one telemetry change:
    
    - 7c308399b perf(rete): β-A skip counters + benchmark output of gate
      effectiveness
    
    Adds two SelectiveEvaluator counters
    (total_relational_alpha_skips_disjoint and
    total_relational_alpha_skips_source_type) so we can see how much the
    β-A watched-property gate and the source-type filter actually save on
    real recipes. Surfaced via get_rete_stats() and printed in the
    two_origins benchmark output as a "gate effectiveness" ratio.
    
    Initial measurement (local, two_origins):
      Relational alpha evals:        21,068
      Rel. skips (disjoint):          4,304
      Rel. skips (src-type):            483
      Rel. gate effectiveness:        18.5%
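
    The 18.5% figure is consistent with counting skips against all
    relational alpha activations (evaluated plus skipped). That formula
    is inferred from the numbers above, not taken from get_rete_stats();
    the function name below is hypothetical:

```python
def gate_effectiveness(evals: int, skips_disjoint: int, skips_source_type: int) -> float:
    """Share of relational alpha activations that a gate skipped.

    Assumed formula: skips / (evals + skips). Inferred from the reported
    numbers, not from the SelectiveEvaluator implementation.
    """
    skips = skips_disjoint + skips_source_type
    total = evals + skips
    return skips / total if total else 0.0


ratio = gate_effectiveness(21_068, 4_304, 483)
print(f"{ratio:.1%}")  # 4,787 / 25,855 -> prints 18.5%
```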
    
    This closes the β-A measurement question (Phase 2 ship-or-revert
    decision = SHIP, already shipped via aca0c576a / 5ad7d4a57 / the
    network_builder.relational_alpha_watched wiring; just wasn't visible
    in stats output until now).
    
    Tasks #128 (alpha-by-node-type index) and #185 (cluster 2.4x gap)
    are also closed with this tag: #128 is superseded by the measurement
    here and #185 by the cluster-noise finding.
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  • v0.5.434-rete-plus-develop

    protected
    v0.5.434-rete-plus-develop: fix stale xid->uid negative cache poisoning processing CO2
    
    Builds on v0.5.433 with one targeted correctness fix:
    
    - 7f90ba18d fix(processing): re-resolve stale xid->uid mappings poisoned
      by negative cache
    
    Root cause: the process-lifetime find_uid_by_xid cache (introduced in
    36348c871) cached negative lookups, but bulk_insert_xid_uid_mappings
    never invalidated them. The processing-seed path looked up every
    process xid before the Brightway rows existed -> cached None -> later
    inserts wrote the real rows but left the negative cache in place. The
    GFM kept loading ProcessingTagsAndId with uid=None, making
    load_brightway_node_and_subgraph bail out and silently dropping the
    processing CO2 contribution.
    
    Fix has two complementary parts:
    - pg_product_mgr.bulk_insert/bulk_delete_xid_uid_mappings now update
      the cache for affected keys instead of leaving stale entries.
    - processing_gfm.init_cache re-resolves trigger.uid from the stable
      xid when the cached uid is None, so a stale gfm_cache self-corrects
      on next init.
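
    A minimal sketch of the poisoning and of the first half of the fix.
    The class, the dict standing in for the mapping table, and the
    update_cache flag are illustrative assumptions, not the
    pg_product_mgr API:

```python
from typing import Dict, Iterable, Optional, Tuple


class XidUidCache:
    """Process-lifetime cache that also remembers misses (value None)."""

    def __init__(self, db: Dict[str, str]) -> None:
        self._db = db  # stands in for the Postgres mapping table
        self._cache: Dict[str, Optional[str]] = {}

    def find_uid_by_xid(self, xid: str) -> Optional[str]:
        if xid not in self._cache:
            # A miss is cached as None: a "negative" entry.
            self._cache[xid] = self._db.get(xid)
        return self._cache[xid]

    def bulk_insert(self, rows: Iterable[Tuple[str, str]], update_cache: bool) -> None:
        for xid, uid in rows:
            self._db[xid] = uid
            if update_cache:
                # The fix: overwrite any stale (possibly negative) entry.
                self._cache[xid] = uid


# Old behaviour: look up before the row exists, then insert without
# touching the cache.
buggy = XidUidCache({})
buggy.find_uid_by_xid("proc-1")
buggy.bulk_insert([("proc-1", "uid-42")], update_cache=False)
stale = buggy.find_uid_by_xid("proc-1")  # still None: the cache is poisoned

# Fixed behaviour: the insert refreshes the affected cache keys.
fixed = XidUidCache({})
fixed.find_uid_by_xid("proc-1")
fixed.bulk_insert([("proc-1", "uid-42")], update_cache=True)
fresh = fixed.find_uid_by_xid("proc-1")  # "uid-42"
```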
    
    Validation:
    - test_processing_gfm.py: 7 failed / 2 passed -> 9 passed
    - core gauntlet: 417 passed / 1 pre-existing failure (kWh
      self-reference unit conversion, unrelated to this change)
    
    Expected cluster effect: the per-recipe "1-2% lower CO2 than develop"
    on combined recipes should close to zero on any recipe that triggers
    a processing step. Cluster benchmark variance (2-4x noise on identical
    builds) still applies — verify with median over multiple runs, not a
    single-shot comparison.
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  • v0.5.433-rete-plus-develop

    protected
    perf(db): batch required_matching node-prop writes (final per-flow DB roundtrip)
    
    The last un-batched per-flow DB roundtrip. MatchProductNameGFM.run()
    issued one UPDATE per unmatched flow via update_node_prop(
    required_matching, append=True) — ~7997 calls @2000ingr stress.
    Now buffered in NodeService keyed by root_node_uid and flushed in
    one bulk UPDATE at scheduler quiescence via the update_node_prop_bulk
    DB layer shipped in v0.5.432. In-memory PropListMutation stays
    immediate.
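
    The buffering idea, sketched under assumptions: PropWriteBuffer and
    its method names are hypothetical, and bulk_write stands in for the
    update_node_prop_bulk DB layer:

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Row = Tuple[str, str]  # (node_uid, value to append)


class PropWriteBuffer:
    """Collects per-flow appends and writes them in one bulk call per flush."""

    def __init__(self, bulk_write: Callable[[List[Row]], None]) -> None:
        self._bulk_write = bulk_write
        self._pending: DefaultDict[str, List[Row]] = defaultdict(list)

    def append(self, root_node_uid: str, node_uid: str, value: str) -> None:
        # No DB roundtrip here; only the in-memory buffer grows.
        self._pending[root_node_uid].append((node_uid, value))

    def flush(self, root_node_uid: str) -> int:
        rows = self._pending.pop(root_node_uid, [])
        if rows:
            self._bulk_write(rows)  # one roundtrip for all buffered rows
        return len(rows)


calls: List[List[Row]] = []
buf = PropWriteBuffer(calls.append)
for i in range(5):
    buf.append("root-1", f"flow-{i}", "needs-matching")
flushed = buf.flush("root-1")  # called once, at scheduler quiescence
```

    Five appends end up as a single bulk_write call; a second flush for
    the same root is a no-op.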
    
    This is the required_matching half reverted in v0.5.432. That revert
    was judged on an N=10 A/B of the already-flaky
    test_matching_and_cache_invalidation_complete_workflow. Re-verified
    independently at N=20: baseline 13/20 pass, this change 14/20 pass.
    The difference is well within noise; the change does NOT worsen the flake.
    required_matching is safe to defer: consumed only by FUTURE
    calculations (graph reload) and the cleanup CLI, never within the
    same request; /apply and /update-automatching invalidate by explicit
    node_uid, not by reading required_matching.
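
    One way to check that 13/20 vs 14/20 is indistinguishable is a
    two-sided Fisher exact test. This is an added illustration, not
    part of the original validation; the function is a stdlib-only
    sketch:

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(pass_a: int, n_a: int, pass_b: int, n_b: int) -> Fraction:
    """Two-sided Fisher exact p-value for two pass/fail samples."""
    total_pass = pass_a + pass_b
    total = n_a + n_b
    denom = comb(total, n_a)

    def prob(k: int) -> Fraction:
        # Hypergeometric probability that sample A holds k of the passes.
        return Fraction(comb(total_pass, k) * comb(total - total_pass, n_a - k), denom)

    lo = max(0, n_a - (total - total_pass))
    hi = min(n_a, total_pass)
    observed = prob(pass_a)
    # Sum every outcome at least as unlikely as the observed one.
    return sum((prob(k) for k in range(lo, hi + 1) if prob(k) <= observed), Fraction(0))


p_value = fisher_exact_two_sided(13, 20, 14, 20)
print(float(p_value))
```

    For these counts the observed split is the most likely outcome
    under the null hypothesis, so the p-value is 1.0.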
    
    Validation:
    - two_origins CO2=1.2568, subrecipe CO2=0.0930 invariants hold
    - 120/120 broad gauntlet
    - 41/44 legacy_recipe_router (3 pre-existing batch flakes only)
    - N=20 flake A/B independently re-run: 13/20 baseline vs 14/20 change
    
    NOTE ON CLUSTER MEASUREMENT: the dagster cluster benchmark is too
    noisy for single-run per-tag comparison — two runs of the identical
    v0.5.432 build 5h apart differed 2-4x (combined_40: 5.34s vs 9.60s;
    develop baseline itself shifted 4.60s -> 3.64s). All perf claims in
    the v0.5.426-v0.5.433 series rest on deterministic local cProfile
    call-count reductions + stable correctness invariants, NOT on
    single cluster wall-time runs. A rigorous cluster A/B needs N>=10
    interleaved runs per build.
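
    A sketch of such an interleaved A/B, assuming each build can be
    timed through a callable. The helper name and the timings are
    synthetic and only illustrate the shape of the comparison:

```python
import statistics
from typing import Callable, Dict, List


def interleaved_medians(builds: Dict[str, Callable[[], float]], rounds: int) -> Dict[str, float]:
    """Time every build once per round (A, B, A, B, ...) and return medians."""
    samples: Dict[str, List[float]] = {name: [] for name in builds}
    for _ in range(rounds):
        # Alternating builds inside each round spreads cluster drift over both.
        for name, run_once in builds.items():
            samples[name].append(run_once())
    return {name: statistics.median(vals) for name, vals in samples.items()}


# Synthetic timings in seconds, made up for illustration only.
timings_a = iter([5.0, 9.0, 5.2, 5.1, 8.8, 5.3, 4.9, 5.0, 5.1, 5.2])
timings_b = iter([5.1, 5.0, 9.2, 5.2, 5.0, 8.9, 5.1, 5.3, 5.0, 5.2])
medians = interleaved_medians(
    {"build_a": lambda: next(timings_a), "build_b": lambda: next(timings_b)},
    rounds=10,
)
```

    The median ignores the occasional 9s outlier that would dominate a
    single-shot comparison.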
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  • v0.5.432-rete-plus-develop

    protected
    perf(db): batch aggregated_cache writes in save_as_system_process
    
    save_as_system_process now issues one bulk UPDATE for all supply-node
    aggregated_cache writes instead of one roundtrip per node. New
    update_node_prop_bulk (UPDATE ... FROM unnest) handles both append
    modes; text[]+::jsonb cast avoids asyncpg list-collapse ambiguity.
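
    A rough sketch of the UPDATE ... FROM unnest shape for the replace
    mode only. Table name, column names and the helper are assumptions;
    the real update_node_prop_bulk is not shown here. Values are sent as
    JSON strings in a text[] and cast to jsonb per row:

```python
import json
from typing import Any, Dict, List, Tuple

# Assumed table and column names; the real schema is not shown in this message.
BULK_SQL = """
UPDATE node AS n
SET props = jsonb_set(n.props, ARRAY[$1::text], v.val::jsonb)
FROM unnest($2::text[], $3::text[]) AS v(uid, val)
WHERE n.uid = v.uid
"""


def build_bulk_args(prop_name: str, values_by_uid: Dict[str, Any]) -> Tuple[str, List[str], List[str]]:
    """Return the three positional arguments for BULK_SQL."""
    uids = list(values_by_uid)
    # Each value travels as a JSON string, so a list value stays one array element.
    payloads = [json.dumps(values_by_uid[uid]) for uid in uids]
    return prop_name, uids, payloads


args = build_bulk_args("aggregated_cache", {"n1": {"co2": 1.5}, "n2": [1, 2]})
```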
    
    The required_matching half was implemented then reverted — deferring
    that DB write made a pre-existing flaky test timing-sensitive (A/B at
    N=10 confirmed the aggregated_cache-only change is flake-neutral:
    baseline 3/10, change 4/10). required_matching batching tracked
    separately for its own investigation.
    
    Validation: two_origins CO2=1.2568, subrecipe CO2=0.0930,
    120/121 broad gauntlet (1 pre-existing flake, A/B-confirmed unaffected).
  • v0.5.431-rete-plus-develop

    protected
    perf(db): batch add_matching_required DB roundtrips (8215 -> 3 per calc)
    
    cluster impact: ~5ms per saved roundtrip; combined_40 expected to drop
    several seconds since matching_required calls were a serial chain
    through MatchProductNameGFM.run(). Local cProfile confirms call count
    drop of 99.96%; local wall time is noise-bound because local Postgres
    roundtrip is microseconds.
    
    Validation:
    - two_origins CO2=1.2568
    - subrecipe canary CO2=0.0930
    - 114/115 broad gauntlet (the 1 flake is pre-existing, not caused by
      this change — confirmed via stash test: 3/5 pass baseline, 3/5 pass
      with change)
    - Dedupe fix for ON CONFLICT DO UPDATE collision verified
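
    Background for the dedupe item: PostgreSQL rejects an INSERT ... ON
    CONFLICT DO UPDATE whose batch would update the same row twice, so
    a bulk upsert needs one row per key. A minimal last-wins sketch;
    whether the actual fix keeps the last value or merges duplicates is
    not stated here, and the helper name is hypothetical:

```python
from typing import Dict, Iterable, List, Tuple


def dedupe_last_wins(rows: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep one row per key so a single upsert never touches a row twice."""
    latest: Dict[str, str] = {}
    for key, value in rows:
        latest[key] = value  # later duplicates overwrite earlier ones
    return list(latest.items())


batch = [("flow-1", "a"), ("flow-2", "b"), ("flow-1", "c")]
unique = dedupe_last_wins(batch)
```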
  • v0.5.430-rete-plus-develop

    protected
    perf(orchestrator): shallow clone in AddNodeMutation (-6.3% wall)
    
    AddNodeMutation.apply() previously used self.new_node.model_copy(deep=True)
    to materialize a node into the calc graph. cProfile on stress_scaling@2000ingr
    showed copy.deepcopy at 414k calls / 0.68s self-time = 4.2% of wall, plus
    a long tail of internal deepcopy work pushing total deepcopy-related cost
    to ~6% of wall.
    
    Replaced with a _shallow_clone_node helper. Safe because:
    1. Pydantic Props are immutable by contract — PropMutation REPLACES slots
       via super().__setattr__, never mutates in-place. Inner Prop data
       (e.g. gfm_state.worker_states dict) can be shared.
    2. New node gets fresh __dict__ so private slots like _calculation /
       _parent_nodes don't leak back to the caller.
    3. Each Prop slot gets a fresh Prop instance so set_owner_node_for_props
       doesn't rebind the source's Props.
    4. Iterates __dict__ (not model_fields_set) — required because
       inventory_importer/bw_importer trims model_fields_set on cached
       ElementaryResourceEmissionNodes; iterating model_fields_set would
       silently drop uid and break downstream add_edge.
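
    The four points above, sketched with plain Python classes instead
    of Pydantic models. Prop, Node and shallow_clone_node are
    simplified stand-ins, not the real _shallow_clone_node helper:

```python
import copy


class Prop:
    """Stand-in for an immutable-by-contract Prop with an owner back-reference."""

    def __init__(self, data: dict) -> None:
        self.data = data  # shared between clones, never mutated in place
        self.owner = None


class Node:
    def __init__(self, uid: str, **props: Prop) -> None:
        self.uid = uid
        self._parent_nodes: list = []  # private slot, must not leak into clones
        for name, prop in props.items():
            setattr(self, name, prop)
            prop.owner = self


def shallow_clone_node(source: Node) -> Node:
    clone = Node.__new__(Node)  # fresh object with its own __dict__
    # Iterate __dict__ so every stored attribute, including uid, is carried over.
    for name, value in source.__dict__.items():
        if name.startswith("_"):
            continue  # private slots are rebuilt below, not copied
        if isinstance(value, Prop):
            fresh = copy.copy(value)  # new Prop instance, same inner data
            fresh.owner = clone       # rebinding does not touch the source Prop
            value = fresh
        clone.__dict__[name] = value
    clone._parent_nodes = []
    return clone


src = Node("n1", amount=Prop({"value": 50}))
src._parent_nodes.append("parent")
dup = shallow_clone_node(src)
```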
    
    Measurements (stress_scaling@2000ingr, 3 runs):
    - copy.deepcopy ncalls: 414,310 → 12,025  (-97%)
    - copy.deepcopy tottime: 0.676s → 0.147s  (-78%)
    - Total wall time: 14.218s → 13.32s mean  (-6.3%)
    
    Investigation findings from parallel attempts (not committed):
    - Path D (asyncio yield frequency): the kqueue 17.4% figure was stale;
      current HEAD already runs kqueue at 1.5-2%. No further win available.
    - Path F (cache is_node_in_affected_subtree): function short-circuits
      in healthy workloads (0.029% wall, not 1.4%). Not worth caching.
    
    Validation:
    - test_benchmark_two_origins CO2=1.2568 invariant holds
    - test_calculation_with_subrecipe CO2=0.0930 invariant holds
    - 96/96 in broad gauntlet
    - 41/44 legacy_recipe_router (3 pre-existing batch flakes only)
  • v0.5.429-rete-plus-develop

    protected
    perf(rete): UWC Site #5 conversion + LocationGFM defensive fix
    
    Two new improvements beyond v0.5.428 (which already shipped Site #4
    + foundation):
    
    1. Site #5 (FlowNode no sub_nodes yet) converted from imperative
       100-reschedule loop to cancel+final_pass with per-factory
       cancel-count tracking. has_child trigger handles the success path;
       cancel-count-1 + final_pass dispatch handles the failure path
       (UWC.run() must still emit "No matching unit term found" for
       invalid units).
    
    2. LocationGFM._handle_conflicting_locations no longer unconditionally
       overwrites new_flow.amount with self.node.production_amount —
       skips when new_flow already has a QuantityProp amount.
       DuplicateNodeMutation (above) shallow-copies parent_node and
       inherits its amount; the unconditional overwrite silently
       re-stamped subrecipe-link FPFs' inherited 50g with the subrecipe's
       228g aggregate, which scaled matrix contributions incorrectly.
       The bug is latent on the imperative path (UWC hasn't committed
       yet) and surfaces under any UWC #1+#2 conversion. The fix is
       defensive: correct regardless of UWC conversion path.
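
    The guard in item 2 amounts to "only stamp the amount when none is
    set". A reduced sketch with a hypothetical Flow type (the real code
    checks for a QuantityProp amount):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Flow:
    amount_g: Optional[float] = None  # None means "no amount set yet"


def stamp_amount(new_flow: Flow, production_amount_g: float) -> Flow:
    # Defensive: keep an inherited amount, fill in only when it is missing.
    if new_flow.amount_g is None:
        new_flow.amount_g = production_amount_g
    return new_flow


inherited = stamp_amount(Flow(amount_g=50.0), 228.0)  # stays at 50 g
empty = stamp_amount(Flow(), 228.0)                   # becomes 228 g
```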
    
    UWC Sites #1+#2 still imperative — Path B (2026-05-12) investigation
    confirmed even fixing LocationGFM doesn't unlock conversion because of
    a downstream TransportDecision/TransportModeDistance chain that keys
    on UWC's timing.
    
    Investigation findings on Pydantic perf (not actionable):
    - node.__getattribute__ is 6.6% of wall in local tests
    - ENTIRELY due to VERIFY_NODE_READ_ONLY=True (dev/test default)
    - helm values-{dev,prod}.yaml set verify_node_read_only=false
    - Cluster overhead is purely RETE/orchestrator infrastructure
    
    Validation:
    - two_origins CO2=1.2568 invariant holds
    - test_calculation_with_subrecipe CO2=0.0930 invariant holds
    - 96/96 in broad gauntlet
    - 41/44 legacy_recipe_router (3 pre-existing batch flakes only)
  • v0.5.428-rete-plus-develop

    protected
    perf(rete): convert UWC Site #5 + document Sites #1+#2 blocker
    
    Site #5 (FlowNode no sub_nodes yet) converted from imperative
    100-reschedule loop to cancel+final_pass with per-factory cancel-count
    tracking. First can_run_now returns cancel; the final_pass dispatcher
    rehabs and re-invokes; second can_run_now returns ready so UWC.run()
    executes against settled state. The has_child trigger handles the
    success case (sub_node materialized by LTAN).
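
    The cancel-then-final-pass sequence can be pictured roughly as
    follows. This is a toy model: Worker, Status and orchestrate are
    invented for illustration and omit the real scheduler, rehab and
    trigger machinery:

```python
from enum import Enum
from typing import List


class Status(Enum):
    READY = "ready"
    CANCEL = "cancel"


class Worker:
    def __init__(self, has_sub_nodes: bool = False) -> None:
        self.has_sub_nodes = has_sub_nodes
        self.cancel_count = 0

    def can_run_now(self) -> Status:
        if self.has_sub_nodes:
            return Status.READY  # success path: the child already exists
        if self.cancel_count == 0:
            self.cancel_count += 1
            return Status.CANCEL  # first ask: step aside instead of rescheduling
        return Status.READY  # second ask: run anyway so run() can report the error


def orchestrate(worker: Worker) -> List[str]:
    log: List[str] = []
    final_pass_pending: List[Worker] = []
    status = worker.can_run_now()
    log.append(status.value)
    if status is Status.CANCEL:
        final_pass_pending.append(worker)
    # At quiescence the final pass re-invokes every cancelled worker once.
    for pending in final_pass_pending:
        log.append(pending.can_run_now().value)
    return log


failure_path = orchestrate(Worker(has_sub_nodes=False))
success_path = orchestrate(Worker(has_sub_nodes=True))
```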
    
    Orchestrator-side enablement: _execute_one and _cancel_worker add
    canceled pairs to _final_pass_pending when the GFM opts into
    final_pass. Inert for GFMs that don't opt in.
    
    Sites #1+#2 remain imperative. Root cause documented at c2027e5a7:
    LocationGFM unconditionally writes wrapper-flow.amount from
    self.node.production_amount, overwriting the 50g inherited from
    DuplicateNodeMutation. Imperative path: UWC hasn't committed yet,
    overwrite skipped, 50g preserved. Conversion path: UWC commits 228g
    early, LocationGFM overwrites 50g→228g, wrapper FPA derives wrong
    value. Targeted fix recovers the FPA write but a downstream chain
    of TransportDecision/TransportModeDistance ordering assumptions
    keeps CO2 drifting +0.6%. Full fix requires architectural change:
    move FPA-boundary aggregation into a peer GFM consuming
    gfm_completed facts.
    
    Validation:
    - two_origins CO2=1.2568 invariant holds.
    - test_calculation_with_subrecipe CO2=0.0930 invariant holds.
    - test_orchestration_with_food_product_flow_and_declaration: data
      error 'No matching unit term found' still emitted.
    - 87/87 in core orchestrator + benchmark + GFM + isolation tests.
    - 41/44 legacy_recipe_router (3 pre-existing batch flakes only).
    
    See docs/uwc-fpa-boundary-conversion-design.md.
  • v0.5.427-rete-plus-develop

    protected
    perf(rete): foundation + Site #4 UWC conversion
    
    Foundation fixes that unblock cancel+refire on DuplicateNodeMutation
    runtime nodes, plus UWC Site #4 (FlowNode sub_node wait) converted
    from imperative reschedule to cancel+final_pass.
    
    Three orchestrator-level fixes:
    1. DuplicateNodeMutation strips transient gfm_state entries from
       duplicates for an allow-list of GFMs (currently {UWC}).
    2. Quiescence fire-order: execute_final_pass_refires runs BEFORE
       _execute_final_gfms.
    3. execute_final_pass_refires gained cancel/finished rehab via
       clear_gfm_state_entry + reactivate_gfm_alphas.
    
    UWC Sites #1+#2 stay imperative — bidirectional FPA/FPF aggregation
    cycle (with AddClientNodes-created LinkingActivityNode wrappers)
    cannot be expressed as one-way refire triggers. Site #5 stays
    imperative — fall-through-to-ready timeout has no equivalent in
    cancel+refire.
    
    Cluster wins from v0.5.426 (Origin/IAE conversions) preserved.
    
    Validation:
    - test_benchmark_two_origins CO2=1.2568.
    - test_calculation_with_subrecipe CO2=0.0930.
    - 87/87 in core orchestrator + benchmark + GFM + isolation tests.
    - 41/44 legacy_recipe_router (3 pre-existing batch flakes only).
    
    See docs/uwc-fpa-boundary-conversion-design.md.
  • v0.5.426-rete-plus-develop

    protected
    perf(rete): Origin #8/#9 + IAE #11/#14 imperative→cancel+refire conversions
    
    Surviving conversions from the 6-site sequential attempt:
    - Origin #8/#9 (sub_nodes_recursive amount/has_child triggers, 426c8ac1c)
    - IAE #11/#14 (sub_nodes_recursive nutrient_values trigger, 23741dc33)
    
    Reverted (left imperative): UWC #4, #5, #1+#2 — all FPA-boundary
    sites where a single structural refire trigger is insufficient for
    the multi-event timing barrier the imperative wait was masking.
    UWC #4 collapsed subrecipe CO2 0.0931→0.0506; UWC #1+#2 drifted
    0.0930→0.0936; UWC #5 broke unit conversion in test_recipe_missing_lci.
    
    Validation:
    - two_origins CO2 invariant 1.2568: holds
    - 85/85 in core orchestrator + benchmark + GFM-specific gauntlet
    - 41/44 legacy_recipe_router (3 pre-existing batch flakes only)
    - subrecipe canary: green
  • v0.5.425-rete-plus-develop

    protected
    Revert v0.5.424 β-memory cache — cluster regression confirmed
    
    Two cluster runs of v0.5.424 (b9138673, 6ea91461) both showed a
    roughly 30-60% slowdown across all recipe sizes vs the v0.5.423 baseline:
    
      size       v0.5.423  v0.5.424 run1  v0.5.424 run2
      4_ings      1.65s     2.63s          2.63s
      16_ings     9.04s    11.88s         11.95s
      40_ings    22.65s    33.58s         33.80s
    
    This is the opposite of what local cProfile predicted (-12% cumtime
    on test_benchmark_two_origins). The discrepancy reveals that the
    local two_origins fixture isn't representative of cluster recipe
    behaviour: in the test, alphas are activated only a few times per
    recipe and the cache hit rate is moot; on cluster combined_40 runs,
    the per-call cache overhead (dict.get + dict.__setitem__ + tuple
    alloc for the (gen, result) value) adds up across hundreds of
    thousands of activations to more than evaluate() itself would cost.
    
    Even combined_4 — which has very few related-alpha activations —
    went 1.65s → 2.63s. The slowdown affects orchestration globally,
    not just RelatedNodeAlpha. Possible mechanism: adding
    self._source_eval_cache = {} to AlphaNode subclass __init__ creates
    a per-instance dict on every alpha that's referenced on every
    activate call (via the override's `self._source_eval_cache.get(...)`).
    The instance attribute lookup on a subclass with frozen=False might
    trigger a dict resize or other overhead at scale.
    
    Tests stay green, CO2 invariant holds either way. The β-memory idea
    itself is sound (proven by PerRelatedNodeAlpha._source_eval_cache);
    this attempt to extend the same pattern to RelatedNodeAlpha and
    CrossNodePropertyAlpha was net-negative on the dev cluster shape.
    
    Reverts:
    - 7797eb2f9 perf(rete): per-sweep eval cache on CrossNodePropertyAlpha
    - 420f88d20 perf(rete): per-sweep eval cache on RelatedNodeAlpha (β-memory analog)
    
    The branch perf/beta-discrimination is preserved on origin for
    post-mortem reference. The Phase 0 falsifier (commit 99ded2911)
    remains shipped as diagnostic infrastructure; it correctly measured
    75.4% noop ratio but the noop short-circuit path turned out to be
    cheaper than the cache-hit path on cluster.
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  • v0.5.424-rete-plus-develop

    protected
    RETE relational alpha β-memory (per-sweep eval cache)
    
    Cluster cProfile (combined_40_ings, eos v0.5.423) showed
    alpha_network.activate at 5.36s cumtime as the residual hot spot
    after the v0.5.420-423 micro-optimizations. Phase 0 falsifier
    (commit 99ded2911) measured RelatedNodeAlpha at 84.7% noop ratio
    on the two_origins fixture — 17K of 20K activations end with the
    predicate output unchanged. The existing _eval_cache short-circuits
    PROPAGATION but evaluate() itself still runs on every call.
    
    This release adds the textbook β-memory pattern (cache the join
    result keyed on a generation that advances when inputs change) to
    the two relational alpha types whose evaluate() bodies do
    non-trivial per-related-uid work:
    
    - 420f88d20  perf(rete): per-sweep eval cache on RelatedNodeAlpha
    - 7797eb2f9  perf(rete): per-sweep eval cache on CrossNodePropertyAlpha
    
    Mirrors the proven PerRelatedNodeAlpha._source_eval_cache pattern.
    Within a sweep the working memory's cache_generation is fixed, so
    repeat activations on the same source uid would walk the same
    related set, do the same filter loop, produce the same quantifier
    reduction. Cache short-circuits before evaluate() runs.
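
    A minimal sketch of a generation-keyed result cache, assuming one
    cached (generation, result) pair per source uid. GenerationCache
    and its members are hypothetical and do not mirror the AlphaNode
    classes:

```python
from typing import Callable, Dict, Tuple


class GenerationCache:
    """Caches evaluate() results until the generation counter advances."""

    def __init__(self, evaluate: Callable[[str], bool]) -> None:
        self._evaluate = evaluate
        self._cache: Dict[str, Tuple[int, bool]] = {}
        self.generation = 0
        self.evaluations = 0

    def advance(self) -> None:
        # Called when inputs change; older entries become stale.
        self.generation += 1

    def activate(self, source_uid: str) -> bool:
        hit = self._cache.get(source_uid)
        if hit is not None and hit[0] == self.generation:
            return hit[1]  # same generation: skip evaluate() entirely
        self.evaluations += 1
        result = self._evaluate(source_uid)
        self._cache[source_uid] = (self.generation, result)
        return result


cache = GenerationCache(lambda uid: uid.startswith("a"))
first = cache.activate("a1")
second = cache.activate("a1")   # served from the cache
evals_same_generation = cache.evaluations
cache.advance()
third = cache.activate("a1")    # re-evaluated after the generation bump
evals_after_advance = cache.evaluations
```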
    
    Local cProfile (test_benchmark_two_origins): alpha.activate calls
    53K → 33K (-38%); cProfile total 1.295s → 1.146s (-12% cumtime).
    Wall-clock at this fixture size is within macOS scheduler noise;
    combined_40_ings is the target.
    
    Cancel-rehab cleanup is automatic — `reset_gfm_activation` already
    pops `_source_eval_cache` via hasattr branch.
    
    Tests: 532/532 green. CO2 invariant test_benchmark_two_origins=1.2568
    holds. The branch perf/beta-discrimination was created to scope this
    work and is preserved on origin.
    
    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>