Tags give the ability to mark specific points in history as being important
-
v0.5.438-rete-plus-develop
protected · f5576e7d
v0.5.438-rete-plus-develop: cherry-pick 3 high-value develop fixes

Surgical pick of 3 commits from develop, validated locally:

- 473cc0915 pre-transpose technosphere matrix to optimize MatrixCalculationGFM (Simon Greuter, 0592f3a34 on develop; 1 file, +2/-1, 7/7 unit tests pass)
- 4f6315640 fix missing massive nutrients that should be set to 0 in attach food tag gfm (Simon Greuter, 1efef2700 on develop; 1 file, +19/-9, 7/7 unit tests pass)
- f5576e7d8 Fix handling of DeletedLinkToUidProp in AddClientNodesGFM (Simon Greuter, 2476d9e9a on develop; 2 files, +17/-17, 36/36 unit tests pass)

The 4th candidate (5568f41b3 "Skip invalidation for known non invalidating transient upserts" by Yannick Schubert) was NOT picked: it deletes lan_upsert_diff_classifier.py and replaces it with two new classifier files. On HEAD the file had been refactored away already, producing a modify/delete conflict that's not a surgical resolution. That commit needs proper integration design, deferred.

Replica suite (5 dagster combined recipes) passes with identical error counts to v0.5.437: no regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
-
v0.5.437-rete-plus-develop
protected · ebb956e0
v0.5.437-rete-plus-develop: fix missing_greenhouse_data regression on top-level FPFs (#192)

Builds on v0.5.436 with one correctness fix:

- ebb956e09 fix(greenhouse): skip emission on top-level FPFs without resolved location

Closes the missing_greenhouse_data regression that v0.5.436 cluster validation (run f015ca59) showed was independent of #191: present at dev counts 0/2/3/3/1/3/3/8/6/6 on combined_8..40 vs prod all-zero.

Same RETE gate divergence as #191: ancestor_flow_location_resolved allow_none=True lets origin-less top-level FPFs through where develop's PropCheck(stop_at_settled=True) waits indefinitely.

Fix extends the existing greenhouse run-time skip-if-no-inheritable-country guard to also cover AddClientNodesGapFillingWorker.

Local: local_combined_recipe_3_ings missing_greenhouse_data 1 -> 0. All 5 replica tests pass. 23/23 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
-
v0.5.436-rete-plus-develop
protected · 297dc0b6
v0.5.436-rete-plus-develop: fix unknown_origin_location regression on top-level FPFs (#191)

Builds on v0.5.435 with one correctness fix:

- 297dc0b69 fix(rainforest): skip emission on top-level FPFs without resolved location

Closes the unknown_origin_location regression that scaled with recipe size (cluster: dev 5/8/18/42 vs prod 1/0/2/2 on 4/8/16/40_ings).

Root cause: rainforest's ancestor_flow_location_resolved gate uses allow_none=True, which lets top-level recipe FPFs through even when LocationGFM never fires on them (LocationGFM only runs when flow_location is a raw str; None never qualifies, so a FPF with no user-specified origin stays at None forever).

Fix extends the existing Origin/WaterScarcity-duplicate skip-if-no-inheritable-location guard to also cover AddClientNodesGapFillingWorker.

Local repro: combined_4 unknown_origin_location 3 -> 1 (matches develop). All RETE/GFM unit suites pass (20/20).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
-
v0.5.435-rete-plus-develop
protected · 7c308399
v0.5.435-rete-plus-develop: β-A skip counters + measurement

Builds on v0.5.434 with one telemetry change:

- 7c308399b perf(rete): β-A skip counters + benchmark output of gate effectiveness

Adds two SelectiveEvaluator counters (total_relational_alpha_skips_disjoint, total_relational_alpha_skips_source_type) so we can see how much the β-A watched-property gate and the source-type filter actually save on real recipes. Surfaced via get_rete_stats() and printed in the two_origins benchmark output as a "gate effectiveness" ratio.

Initial measurement (local, two_origins):

  Relational alpha evals:    21,068
  Rel. skips (disjoint):      4,304
  Rel. skips (src-type):        483
  Rel. gate effectiveness:    18.5%

This closes the β-A measurement question (Phase 2 ship-or-revert decision = SHIP, already shipped via aca0c576a / 5ad7d4a57 / the network_builder.relational_alpha_watched wiring; just wasn't visible in stats output until now).

Tasks #128 (alpha-by-node-type index) and #185 (cluster 2.4x gap) also closed in this turn, superseded by the measurement here and the cluster-noise finding respectively.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
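The note does not say how the "gate effectiveness" ratio is derived from the three counters. One reading that reproduces the reported 18.5% is skipped activations divided by all activations that reached the gate (evaluated plus skipped). The sketch below checks that arithmetic; the formula and the function name are assumptions, not taken from get_rete_stats().

```python
# Counter values copied from the v0.5.435 measurement.
relational_alpha_evals = 21_068
skips_disjoint = 4_304
skips_source_type = 483


def gate_effectiveness(evals: int, *skips: int) -> float:
    """Share of gate arrivals that were skipped (assumed formula)."""
    skipped = sum(skips)
    return skipped / (evals + skipped)


ratio = gate_effectiveness(relational_alpha_evals, skips_disjoint, skips_source_type)
print(f"{ratio:.1%}")  # 18.5%
```

Dividing by the evaluation count alone (4,787 / 21,068) would give about 22.7%, so the reported figure is more consistent with the denominator used here.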
-
v0.5.434-rete-plus-develop
protected · 7f90ba18
v0.5.434-rete-plus-develop: fix stale xid->uid negative cache poisoning processing CO2

Builds on v0.5.433 with one targeted correctness fix:

- 7f90ba18d fix(processing): re-resolve stale xid->uid mappings poisoned by negative cache

Root cause: the process-lifetime find_uid_by_xid cache (introduced in 36348c871) cached negative lookups, but bulk_insert_xid_uid_mappings never invalidated them. The processing-seed path looked up every process xid before the Brightway rows existed -> cached None -> later inserts wrote the real rows but left the negative cache in place. The GFM kept loading ProcessingTagsAndId with uid=None, making load_brightway_node_and_subgraph bail out and silently dropping the processing CO2 contribution.

Fix has two complementary parts:

- pg_product_mgr.bulk_insert/bulk_delete_xid_uid_mappings now update the cache for affected keys instead of leaving stale entries.
- processing_gfm.init_cache re-resolves trigger.uid from the stable xid when the cached uid is None, so a stale gfm_cache self-corrects on next init.

Validation:

- test_processing_gfm.py: 7 failed / 2 passed -> 9 passed
- core gauntlet: 417 passed / 1 pre-existing failure (kWh self-reference unit conversion, unrelated to this change)

Expected cluster effect: the per-recipe "1-2% lower CO2 than develop" on combined recipes should close to zero on any recipe that triggers a processing step. Cluster benchmark variance (2-4x noise on identical builds) still applies; verify with median over multiple runs, not a single-shot comparison.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
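The failure mode described in this entry can be reproduced in a few lines. The sketch below is a toy model, not the project's code: the class `XidUidStore`, its `refresh_cache_on_write` flag, and the helper `lookup_after_late_insert` are hypothetical names used only to contrast the two behaviours (cache left stale on write vs. cache updated for affected keys).

```python
from typing import Optional


class XidUidStore:
    """Toy mapping table with a process-lifetime lookup cache (hypothetical)."""

    def __init__(self, refresh_cache_on_write: bool) -> None:
        self._rows: dict[str, str] = {}
        self._cache: dict[str, Optional[str]] = {}
        self._refresh = refresh_cache_on_write

    def find_uid_by_xid(self, xid: str) -> Optional[str]:
        if xid in self._cache:
            return self._cache[xid]
        uid = self._rows.get(xid)
        self._cache[xid] = uid  # negative results (None) are cached too
        return uid

    def bulk_insert(self, mappings: dict[str, str]) -> None:
        self._rows.update(mappings)
        if self._refresh:
            self._cache.update(mappings)  # overwrite any cached None

    def bulk_delete(self, xids: list[str]) -> None:
        for xid in xids:
            self._rows.pop(xid, None)
            if self._refresh:
                self._cache.pop(xid, None)  # drop the now-stale positive entry


def lookup_after_late_insert(store: XidUidStore) -> Optional[str]:
    store.find_uid_by_xid("proc-1")          # seed path: row does not exist yet
    store.bulk_insert({"proc-1": "uid-42"})  # row arrives later
    return store.find_uid_by_xid("proc-1")


stale = lookup_after_late_insert(XidUidStore(refresh_cache_on_write=False))
fixed = lookup_after_late_insert(XidUidStore(refresh_cache_on_write=True))
print(stale, fixed)  # None uid-42
```

The second part of the fix (re-resolving from the stable xid when the cached uid is None) is a separate safety net for caches that were already poisoned and is not modelled here.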
-
v0.5.433-rete-plus-develop
protected · 7f6ad159
perf(db): batch required_matching node-prop writes (final per-flow DB roundtrip)

The last un-batched per-flow DB roundtrip. MatchProductNameGFM.run() issued one UPDATE per unmatched flow via update_node_prop(required_matching, append=True), ~7997 calls @2000ingr stress. Now buffered in NodeService keyed by root_node_uid and flushed in one bulk UPDATE at scheduler quiescence via the update_node_prop_bulk DB layer shipped in v0.5.432. In-memory PropListMutation stays immediate.

This is the required_matching half reverted in v0.5.432. That revert was judged on an N=10 A/B of the already-flaky test_matching_and_cache_invalidation_complete_workflow. Re-verified independently at N=20: baseline 13/20 pass, this change 14/20 pass. Statistically identical; the change does NOT worsen the flake.

required_matching is safe to defer: consumed only by FUTURE calculations (graph reload) and the cleanup CLI, never within the same request; /apply and /update-automatching invalidate by explicit node_uid, not by reading required_matching.

Validation:

- two_origins CO2=1.2568, subrecipe CO2=0.0930 invariants hold
- 120/120 broad gauntlet
- 41/44 legacy_recipe_router (3 pre-existing batch flakes only)
- N=20 flake A/B independently re-run: 13/20 baseline vs 14/20 change

NOTE ON CLUSTER MEASUREMENT: the dagster cluster benchmark is too noisy for single-run per-tag comparison: two runs of the identical v0.5.432 build 5h apart differed 2-4x (combined_40: 5.34s vs 9.60s; develop baseline itself shifted 4.60s -> 3.64s). All perf claims in the v0.5.426-v0.5.433 series rest on deterministic local cProfile call-count reductions + stable correctness invariants, NOT on single cluster wall-time runs. A rigorous cluster A/B needs N>=10 interleaved runs per build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
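The buffer-then-flush idea in this entry can be sketched without a database. Everything below is illustrative: `FakeDb`, `RequiredMatchingBuffer`, and their method names are hypothetical stand-ins (the real buffering lives in NodeService and the real bulk write in the update_node_prop_bulk layer, neither of which is shown in the source). The fake DB only counts roundtrips.

```python
from collections import defaultdict


class FakeDb:
    """Stand-in for the DB layer; counts roundtrips instead of running SQL."""

    def __init__(self) -> None:
        self.roundtrips = 0
        self.written: list[tuple[str, str]] = []

    def bulk_update(self, rows: list[tuple[str, str]]) -> None:
        self.roundtrips += 1  # one statement regardless of row count
        self.written.extend(rows)


class RequiredMatchingBuffer:
    """Collects (node_uid, value) writes per root node and flushes them in bulk."""

    def __init__(self, db: FakeDb) -> None:
        self._db = db
        self._pending: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add(self, root_node_uid: str, node_uid: str, value: str) -> None:
        self._pending[root_node_uid].append((node_uid, value))

    def flush(self, root_node_uid: str) -> int:
        """Write everything buffered for one root in a single roundtrip."""
        rows = self._pending.pop(root_node_uid, [])
        if rows:
            self._db.bulk_update(rows)
        return len(rows)


db = FakeDb()
buffer = RequiredMatchingBuffer(db)
for i in range(7997):  # call count quoted in the entry
    buffer.add("root-1", f"flow-{i}", "needs-matching")

flushed = buffer.flush("root-1")  # would run at scheduler quiescence
print(flushed, db.roundtrips)  # 7997 1
```

Keying the buffer by root node means a flush for one calculation cannot pick up writes that belong to another calculation still in flight, which is presumably why the entry calls out the key.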
-
v0.5.432-rete-plus-develop
protected · b17d9f12
perf(db): batch aggregated_cache writes in save_as_system_process

save_as_system_process now issues one bulk UPDATE for all supply-node aggregated_cache writes instead of one roundtrip per node. New update_node_prop_bulk (UPDATE ... FROM unnest) handles both append modes; text[]+::jsonb cast avoids asyncpg list-collapse ambiguity.

The required_matching half was implemented then reverted: deferring that DB write made a pre-existing flaky test timing-sensitive (A/B at N=10 confirmed the aggregated_cache-only change is flake-neutral: baseline 3/10, change 4/10). required_matching batching tracked separately for its own investigation.

Validation: two_origins CO2=1.2568, subrecipe CO2=0.0930, 120/121 broad gauntlet (1 pre-existing flake, A/B-confirmed unaffected).
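The "flake-neutral" reading of 3/10 vs 4/10 can be checked with a small significance test. The entry does not say which test, if any, was used; Fisher's exact test is my choice here because the samples are small. The function below is a from-scratch sketch using exact fractions, and the same function is applied to the N=20 rerun (13/20 vs 14/20) reported under v0.5.433.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(pass_a: int, fail_a: int, pass_b: int, fail_b: int) -> Fraction:
    """Two-sided Fisher exact p-value for a 2x2 pass/fail table."""
    runs_a = pass_a + fail_a
    total = runs_a + pass_b + fail_b
    passes = pass_a + pass_b

    def prob(k: int) -> Fraction:
        # Hypergeometric probability that build A gets exactly k of the passes.
        return Fraction(comb(passes, k) * comb(total - passes, runs_a - k),
                        comb(total, runs_a))

    low = max(0, runs_a - (total - passes))
    high = min(passes, runs_a)
    observed = prob(pass_a)
    # Sum every outcome that is no more likely than the observed one.
    return sum((prob(k) for k in range(low, high + 1) if prob(k) <= observed),
               Fraction(0))


p_n10 = fisher_exact_two_sided(3, 7, 4, 6)     # baseline 3/10 vs change 4/10
p_n20 = fisher_exact_two_sided(13, 7, 14, 6)   # baseline 13/20 vs change 14/20
print(float(p_n10), float(p_n20))  # 1.0 1.0
```

A p-value of 1 means the observed split is the single most likely outcome if both builds share one pass rate, so the data give no evidence of a difference. It does not prove the rates are equal; samples this small could hide a moderate effect.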
-
v0.5.431-rete-plus-develop
protected · 28cfcfde
perf(db): batch add_matching_required DB roundtrips (8215 -> 3 per calc)

Cluster impact: ~5ms per saved roundtrip; combined_40 expected to drop several seconds since matching_required calls were a serial chain through MatchProductNameGFM.run(). Local cProfile confirms call count drop of 99.96%; local wall time is noise-bound because local Postgres roundtrip is microseconds.

Validation:

- two_origins CO2=1.2568
- subrecipe canary CO2=0.0930
- 114/115 broad gauntlet (the 1 flake is pre-existing, not caused by this change; confirmed via stash test: 3/5 pass baseline, 3/5 pass with change)
- Dedupe fix for ON CONFLICT DO UPDATE collision verified
-
v0.5.430-rete-plus-develop
protected · 8f3f88bf
perf(orchestrator): shallow clone in AddNodeMutation (-6.3% wall)

AddNodeMutation.apply() previously used self.new_node.model_copy(deep=True) to materialize a node into the calc graph. cProfile on stress_scaling@2000ingr showed copy.deepcopy at 414k calls / 0.68s self-time = 4.2% of wall, plus a long tail of internal deepcopy work pushing total deepcopy-related cost to ~6% of wall.

Replaced with a _shallow_clone_node helper. Safe because:

1. Pydantic Props are immutable by contract: PropMutation REPLACES slots via super().__setattr__, never mutates in-place. Inner Prop data (e.g. gfm_state.worker_states dict) can be shared.
2. New node gets fresh __dict__ so private slots like _calculation / _parent_nodes don't leak back to the caller.
3. Each Prop slot gets a fresh Prop instance so set_owner_node_for_props doesn't rebind the source's Props.
4. Iterates __dict__ (not model_fields_set), required because inventory_importer/bw_importer trims model_fields_set on cached ElementaryResourceEmissionNodes; iterating model_fields_set would silently drop uid and break downstream add_edge.

Measurements (stress_scaling@2000ingr, 3 runs):

- copy.deepcopy ncalls: 414,310 → 12,025 (-97%)
- copy.deepcopy tottime: 0.676s → 0.147s (-78%)
- Total wall time: 14.218s → 13.32s mean (-6.3%)

Investigation findings from parallel attempts (not committed):

- Path D (asyncio yield frequency): the kqueue 17.4% figure was stale; current HEAD already runs kqueue at 1.5-2%. No further win available.
- Path F (cache is_node_in_affected_subtree): function short-circuits in healthy workloads (0.029% wall, not 1.4%). Not worth caching.

Validation:

- test_benchmark_two_origins CO2=1.2568 invariant holds
- test_calculation_with_subrecipe CO2=0.0930 invariant holds
- 96/96 in broad gauntlet
- 41/44 legacy_recipe_router (3 pre-existing batch flakes only)
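The four safety arguments above can be illustrated with plain Python objects. This is a sketch under stated assumptions, not the project's `_shallow_clone_node`: the `Prop` and `Node` classes here are simplified stand-ins for the Pydantic models, and the function name `shallow_clone_node` is hypothetical.

```python
class Prop:
    """Stand-in for an immutable property; `data` is never mutated in place."""

    def __init__(self, data, owner=None):
        self.data = data
        self.owner = owner


class Node:
    """Stand-in for a graph node whose slots live in __dict__."""

    def __init__(self, **slots):
        self.__dict__.update(slots)


def shallow_clone_node(node: Node) -> Node:
    clone = Node.__new__(Node)  # fresh object with its own __dict__
    # Iterate __dict__ so every stored slot (including uid) is carried over.
    for name, value in node.__dict__.items():
        if isinstance(value, Prop):
            # New Prop wrapper owned by the clone; inner data is shared.
            value = Prop(value.data, owner=clone)
        clone.__dict__[name] = value
    return clone


source = Node(uid="n1", amount=Prop({"value": 50, "unit": "g"}))
source.amount.owner = source

clone = shallow_clone_node(source)
clone._calculation = "calc-A"  # private slot set on the clone only

print(clone.uid, clone.amount.data is source.amount.data)  # n1 True
```

Sharing `data` is only safe because of the immutability contract in point 1: if any code mutated a Prop's inner data in place, both nodes would see the change, and a deep copy would be needed again.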
-
v0.5.429-rete-plus-develop
protected · f7ba4b54
perf(rete): UWC Site #5 conversion + LocationGFM defensive fix

Two new improvements beyond v0.5.428 (which already shipped Site #4 + foundation):

1. Site #5 (FlowNode no sub_nodes yet) converted from imperative 100-reschedule loop to cancel+final_pass with per-factory cancel-count tracking. has_child trigger handles the success path; cancel-count-1 + final_pass dispatch handles the failure path (UWC.run() must still emit "No matching unit term found" for invalid units).
2. LocationGFM._handle_conflicting_locations no longer unconditionally overwrites new_flow.amount with self.node.production_amount: it skips when new_flow already has a QuantityProp amount. DuplicateNodeMutation (above) shallow-copies parent_node and inherits its amount; the unconditional overwrite silently re-stamped subrecipe-link FPFs' inherited 50g with the subrecipe's 228g aggregate, scaling matrix contributions wrong. Latent under imperative (UWC hasn't committed yet); surfaces under any UWC #1+#2 conversion. Defensive: correct regardless of UWC conversion path.

UWC Sites #1+#2 still imperative: Path B (2026-05-12) investigation confirmed even fixing LocationGFM doesn't unlock conversion because of a downstream TransportDecision/TransportModeDistance chain that keys on UWC's timing.

Investigation findings on Pydantic perf (not actionable):

- node.__getattribute__ is 6.6% of wall in local tests
- ENTIRELY due to VERIFY_NODE_READ_ONLY=True (dev/test default)
- helm values-{dev,prod}.yaml set verify_node_read_only=false
- Cluster overhead is purely RETE/orchestrator infrastructure

Validation:

- two_origins CO2=1.2568 invariant holds
- test_calculation_with_subrecipe CO2=0.0930 invariant holds
- 96/96 in broad gauntlet
- 41/44 legacy_recipe_router (3 pre-existing batch flakes only)
-
v0.5.428-rete-plus-develop
protected · c2027e5a
perf(rete): convert UWC Site #5 + document Sites #1+#2 blocker

Site #5 (FlowNode no sub_nodes yet) converted from imperative 100-reschedule loop to cancel+final_pass with per-factory cancel-count tracking. First can_run_now returns cancel; final_pass dispatcher rehabs and re-invokes; second can_run_now returns ready so UWC.run() executes against settled state. Has_child trigger handles the success case (sub_node materialized by LTAN).

Orchestrator-side enablement: _execute_one and _cancel_worker add canceled pairs to _final_pass_pending when the GFM opts into final_pass. Inert for GFMs that don't opt in.

Sites #1+#2 remain imperative. Root cause documented at c2027e5a7: LocationGFM unconditionally writes wrapper-flow.amount from self.node.production_amount, overwriting the 50g inherited from DuplicateNodeMutation. Imperative path: UWC hasn't committed yet, overwrite skipped, 50g preserved. Conversion path: UWC commits 228g early, LocationGFM overwrites 50g→228g, wrapper FPA derives wrong value. Targeted fix recovers the FPA write but a downstream chain of TransportDecision/TransportModeDistance ordering assumptions keeps CO2 drifting +0.6%. Full fix requires architectural change: move FPA-boundary aggregation into a peer GFM consuming gfm_completed facts.

Validation:

- two_origins CO2=1.2568 invariant holds.
- test_calculation_with_subrecipe CO2=0.0930 invariant holds.
- test_orchestration_with_food_product_flow_and_declaration: data error 'No matching unit term found' still emitted.
- 87/87 in core orchestrator + benchmark + GFM + isolation tests.
- 41/44 legacy_recipe_router (3 pre-existing batch flakes only).

See docs/uwc-fpa-boundary-conversion-design.md.
-
v0.5.427-rete-plus-develop
protected · ee1aa9c9
perf(rete): foundation + Site #4 UWC conversion

Foundation fixes that unblock cancel+refire on DuplicateNodeMutation runtime nodes, plus UWC Site #4 (FlowNode sub_node wait) converted from imperative reschedule to cancel+final_pass.

Three orchestrator-level fixes:

1. DuplicateNodeMutation strips transient gfm_state entries from duplicates for an allow-list of GFMs (currently {UWC}).
2. Quiescence fire-order: execute_final_pass_refires runs BEFORE _execute_final_gfms.
3. execute_final_pass_refires gained cancel/finished rehab via clear_gfm_state_entry + reactivate_gfm_alphas.

UWC Sites #1+#2 stay imperative: bidirectional FPA/FPF aggregation cycle (with AddClientNodes-created LinkingActivityNode wrappers) cannot be expressed as one-way refire triggers. Site #5 stays imperative: fall-through-to-ready timeout has no equivalent in cancel+refire.

Cluster wins from v0.5.426 (Origin/IAE conversions) preserved.

Validation:

- test_benchmark_two_origins CO2=1.2568.
- test_calculation_with_subrecipe CO2=0.0930.
- 87/87 in core orchestrator + benchmark + GFM + isolation tests.
- 41/44 legacy_recipe_router (3 pre-existing batch flakes only).

See docs/uwc-fpa-boundary-conversion-design.md.
-
v0.5.426-rete-plus-develop
protected · 23741dc3
perf(rete): Origin #8/#9 + IAE #11/#14 imperative→cancel+refire conversions

Surviving conversions from the 6-site sequential attempt:

- Origin #8/#9 (sub_nodes_recursive amount/has_child triggers, 426c8ac1c)
- IAE #11/#14 (sub_nodes_recursive nutrient_values trigger, 23741dc33)

Reverted (left imperative): UWC #4, #5, #1+#2. These are all FPA-boundary sites where a single structural refire trigger is insufficient for the multi-event timing barrier the imperative wait was masking. UWC #4 collapsed subrecipe CO2 0.0931→0.0506; UWC #1+#2 drifted 0.0930→0.0936; UWC #5 broke unit conversion in test_recipe_missing_lci.

Validation:

- two_origins CO2 invariant 1.2568: holds
- 85/85 in core orchestrator + benchmark + GFM-specific gauntlet
- 41/44 legacy_recipe_router (3 pre-existing batch flakes only)
- subrecipe canary: green
-
v0.5.425-rete-plus-develop
protected · f61861ce
Revert v0.5.424 β-memory cache: cluster regression confirmed

Two cluster runs of v0.5.424 (b9138673, 6ea91461) both showed a ~40-50% slowdown across all recipe sizes vs the v0.5.423 baseline:

  size      v0.5.423   v0.5.424 run1   v0.5.424 run2
  4_ings    1.65s      2.63s           2.63s
  16_ings   9.04s      11.88s          11.95s
  40_ings   22.65s     33.58s          33.80s

This is the opposite of what local cProfile predicted (-12% cumtime on test_benchmark_two_origins). The discrepancy reveals that the local two_origins fixture isn't representative of cluster recipe behaviour: in test, alphas are activated few times per recipe and the cache hit rate is moot; on cluster combined_40 runs, the cache overhead per call (dict.get + dict.set + tuple alloc for the (gen, result) value) compounds across hundreds of thousands of activations faster than evaluate() itself would.

Even combined_4, which has very few related-alpha activations, went 1.65s → 2.63s. The slowdown affects orchestration globally, not just RelatedNodeAlpha. Possible mechanism: adding self._source_eval_cache = {} to AlphaNode subclass __init__ creates a per-instance dict on every alpha that's referenced on every activate call (via the override's `self._source_eval_cache.get(...)`). The instance attribute lookup on a subclass with frozen=False might trigger a dict resize or other overhead at scale.

Tests stay green, CO2 invariant holds either way. The β-memory idea itself is sound (proven by PerRelatedNodeAlpha._source_eval_cache); this attempt to extend the same pattern to RelatedNodeAlpha and CrossNodePropertyAlpha was net-negative on the dev cluster shape.

Reverts:

- 7797eb2f9 perf(rete): per-sweep eval cache on CrossNodePropertyAlpha
- 420f88d20 perf(rete): per-sweep eval cache on RelatedNodeAlpha (β-memory analog)

The branch perf/beta-discrimination is preserved on origin for post-mortem reference.

The Phase 0 falsifier (commit 99ded2911) remains shipped as diagnostic infrastructure; it correctly measured 75.4% noop ratio but the noop short-circuit path turned out to be cheaper than the cache-hit path on cluster.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
-
v0.5.424-rete-plus-develop
protected · 5aab37f5
RETE relational alpha β-memory (per-sweep eval cache)

Cluster cProfile (combined_40_ings, eos v0.5.423) showed alpha_network.activate at 5.36s cumtime as the residual hot spot after the v0.5.420-423 micro-optimizations. Phase 0 falsifier (commit 99ded2911) measured RelatedNodeAlpha at 84.7% noop ratio on the two_origins fixture: 17K of 20K activations end with the predicate output unchanged. The existing _eval_cache short-circuits PROPAGATION but evaluate() itself still runs on every call.

This release adds the textbook β-memory pattern (cache the join result keyed on a generation that advances when inputs change) to the two relational alpha types whose evaluate() bodies do non-trivial per-related-uid work:

- 420f88d20 perf(rete): per-sweep eval cache on RelatedNodeAlpha
- 7797eb2f9 perf(rete): per-sweep eval cache on CrossNodePropertyAlpha

Mirrors the proven PerRelatedNodeAlpha._source_eval_cache pattern. Within a sweep the working memory's cache_generation is fixed, so repeat activations on the same source uid would walk the same related set, do the same filter loop, produce the same quantifier reduction. Cache short-circuits before evaluate() runs.

Local cProfile (test_benchmark_two_origins): alpha.activate calls 53K → 33K (-38%); cProfile total 1.295s → 1.146s (-12% cumtime). Wall-clock at this fixture size is within macOS scheduler noise; combined_40_ings is the target.

Cancel-rehab cleanup is automatic: `reset_gfm_activation` already pops `_source_eval_cache` via hasattr branch.

Tests: 532/532 green. CO2 invariant test_benchmark_two_origins=1.2568 holds.

The branch perf/beta-discrimination was created to scope this work and is preserved on origin.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
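The generation-keyed cache described in this entry can be sketched as follows. This is a toy model under stated assumptions: `WorkingMemory`, `CachedRelationalAlpha`, and the all-amounts-positive predicate are hypothetical stand-ins, and only the `_source_eval_cache` and `cache_generation` names are borrowed from the text. The cache stores a (generation, result) pair per source uid and is trusted only while the generation is unchanged.

```python
class WorkingMemory:
    """Holds a generation counter that advances whenever inputs change."""

    def __init__(self) -> None:
        self.cache_generation = 0

    def advance(self) -> None:
        self.cache_generation += 1


class CachedRelationalAlpha:
    """Alpha node that caches evaluate() per source uid within one generation."""

    def __init__(self, memory: WorkingMemory, related: dict[str, list[int]]) -> None:
        self._memory = memory
        self._related = related  # source uid -> amounts of related nodes
        self._source_eval_cache: dict[str, tuple[int, bool]] = {}
        self.evaluate_calls = 0

    def evaluate(self, source_uid: str) -> bool:
        self.evaluate_calls += 1
        # Example quantifier: every related node has a positive amount.
        return all(amount > 0 for amount in self._related[source_uid])

    def activate(self, source_uid: str) -> bool:
        generation = self._memory.cache_generation
        cached = self._source_eval_cache.get(source_uid)
        if cached is not None and cached[0] == generation:
            return cached[1]  # same sweep: skip evaluate()
        result = self.evaluate(source_uid)
        self._source_eval_cache[source_uid] = (generation, result)
        return result


memory = WorkingMemory()
related = {"flow-1": [50, 228]}
alpha = CachedRelationalAlpha(memory, related)

first = [alpha.activate("flow-1") for _ in range(3)]  # one evaluate(), two hits
related["flow-1"].append(0)  # an input changes...
memory.advance()             # ...so the generation advances
after_change = alpha.activate("flow-1")

print(first, after_change, alpha.evaluate_calls)  # [True, True, True] False 2
```

Every activation pays for a dict lookup and, on a miss, a tuple allocation and a dict store. The v0.5.425 entry above reports that on cluster workloads this per-call overhead outweighed the saved evaluate() calls, which is why the extension was reverted.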
-