Tags give the ability to mark specific points in history as being important
-
v0.5.423-rete-plus-develop
protected 07a3122a · RETE working memory redesign: typed property indexes (the hard risk)

Cluster cProfile (combined_40_ings, v0.5.421) flagged dict.get on _facts as the #1 hot spot: 5.85M calls / 1.26s tottime. The cost was per-call 3-tuple allocation + hashing in the canonical `(uid, fact_type, key) -> WorkingMemoryFact` store, plus Fact wrapper unwrapping at every read.

This release adds typed sub-indexes that mirror the property_value and property_type slices of _facts:

    _property_values: dict[node_uid, dict[name, value]]
    _property_types : dict[node_uid, dict[name, type_name]]

Hot-path reads skip tuple construction and Fact wrapping:

    has_property(uid, name)         2x dict.get + `in`
    get_property_value(uid, name)   2x dict.get
    get_property_type(uid, name)    2x dict.get

Maintained alongside _facts in assert_fact, retract_fact, retract_all_for_node, and clear.

Contract test (18 cases) pins:
- read-side semantics (incl. falsy-value preservation: 0/''/False)
- mirror consistency across assert / retract / retract_all / clear
- has_property MUST distinguish property_value from property_type
- get_fact() compat for non-property fact_types

This is the "hard risk" optimization step: it touches the core storage contract of WM. Validated by 532/532 green tests including CO2 invariant on test_benchmark_two_origins.

Local impact invisible (two_origins volume too low). Expected cluster impact 0.4-0.8s on the 22.4s combined_40_ings baseline. Cluster cProfile after deploy will tell us whether this also reduces the cumtime for upstream callers (alpha_network.evaluate 3.46s, conditions.check 1.82s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
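The mirrored-index idea in this release note can be illustrated with a minimal sketch. It is not the project's implementation: the class name `WorkingMemorySketch` and the `_EMPTY` sentinel are hypothetical, raw values stand in for the WorkingMemoryFact wrapper, and retract_all_for_node is omitted for brevity.

```python
from typing import Any

_EMPTY: dict = {}  # shared empty mapping returned on misses; never mutated


class WorkingMemorySketch:
    """Hypothetical, simplified stand-in for the working memory store."""

    def __init__(self) -> None:
        # Canonical store keyed by (uid, fact_type, key).
        self._facts: dict[tuple[str, str, str], Any] = {}
        # Typed mirrors of the property_value / property_type slices.
        self._property_values: dict[str, dict[str, Any]] = {}
        self._property_types: dict[str, dict[str, str]] = {}

    def _mirror_for(self, fact_type: str):
        if fact_type == "property_value":
            return self._property_values
        if fact_type == "property_type":
            return self._property_types
        return None  # other fact types live only in _facts

    def assert_fact(self, uid: str, fact_type: str, key: str, value: Any) -> None:
        self._facts[(uid, fact_type, key)] = value
        mirror = self._mirror_for(fact_type)
        if mirror is not None:
            mirror.setdefault(uid, {})[key] = value

    def retract_fact(self, uid: str, fact_type: str, key: str) -> None:
        self._facts.pop((uid, fact_type, key), None)
        mirror = self._mirror_for(fact_type)
        if mirror is not None:
            per_node = mirror.get(uid)
            if per_node is not None:
                per_node.pop(key, None)
                if not per_node:
                    del mirror[uid]  # keep the mirror free of empty dicts

    def clear(self) -> None:
        self._facts.clear()
        self._property_values.clear()
        self._property_types.clear()

    # Hot-path reads: no tuple construction, no wrapper object.
    def has_property(self, uid: str, name: str) -> bool:
        return name in self._property_values.get(uid, _EMPTY)

    def get_property_value(self, uid: str, name: str, default: Any = None) -> Any:
        return self._property_values.get(uid, _EMPTY).get(name, default)

    def get_property_type(self, uid: str, name: str, default: Any = None) -> Any:
        return self._property_types.get(uid, _EMPTY).get(name, default)
```

Because has_property tests membership rather than truthiness, falsy values such as 0, '' and False remain visible, and a property_type fact alone does not make has_property return True.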
-
v0.5.422-rete-plus-develop
protected a8b122fc · RETE working-memory hot-path tweaks (cluster v0.5.421 cProfile follow-up)

v0.5.421 cluster cProfile (combined_40_ings, 22.1s recipe wall) showed two new top hot spots after the prior optimization rounds removed the obscuring costs:
- working_memory.get_parents: 196K calls / 1.03s tottime (#4 hot spot)
- get_cached_walk: 21K calls/recipe with redundant per-call gen check

Two fixes in this release:

1. working_memory.get_parents
   All callers iterate or use .update()/.extend(); none mutate the returned list. The list(parents) copy on every call was the dominant cost. Return the underlying set by reference (read-only contract documented). Empty case shares a frozenset.

2. get_cached_walk / cache_walk
   bump_cache_generation() always calls _walk_cache.clear(), so surviving entries are valid by construction. The per-call generation comparison in get_cached_walk / cache_walk was dead defensive code. Removed; contract test pins the invariant.

Local: -2.4% to -3.3% wall on two_origins.

Tests: 514/514 green (orchestrator + benchmark).

Skipped: flow-node-flag-bitmask experiment (perf/flow-node-flag-bitmask branch, parked). Agent measured -0.0% wall, not worth the indirection.

Expected cluster gain: 1-2s on 22.1s combined_40_ings baseline. The get_parents win in particular is well-targeted at v0.5.421's #4 hot spot.

After this tag, the next round of optimization needs to attack working memory storage redesign (5.85M dict.get calls / 1.26s). That's a multi-day project, not a one-shot agent task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
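Both fixes can be sketched together in a few lines. This is an illustration under assumptions, not the actual working_memory module: `ParentIndexSketch`, `add_parent` and `_NO_PARENTS` are hypothetical names, while get_parents, bump_cache_generation, cache_walk and get_cached_walk follow the names used in the note.

```python
from typing import AbstractSet, Hashable, Optional

_NO_PARENTS: frozenset = frozenset()  # one shared, immutable empty result


class ParentIndexSketch:
    """Hypothetical stand-in for the parent index and walk cache."""

    def __init__(self) -> None:
        self._parents: dict[str, set[str]] = {}
        self._walk_cache: dict[Hashable, tuple] = {}
        self._generation = 0

    def add_parent(self, child: str, parent: str) -> None:
        self._parents.setdefault(child, set()).add(parent)
        # Any structural change invalidates previously cached walks.
        self.bump_cache_generation()

    def get_parents(self, uid: str) -> AbstractSet[str]:
        # Read-only contract: callers may iterate the result or pass it to
        # update()/extend(), but must not mutate it. No per-call copy.
        return self._parents.get(uid, _NO_PARENTS)

    def bump_cache_generation(self) -> None:
        self._generation += 1
        # Clearing here is what makes every surviving entry valid.
        self._walk_cache.clear()

    def cache_walk(self, key: Hashable, result: tuple) -> None:
        self._walk_cache[key] = result

    def get_cached_walk(self, key: Hashable) -> Optional[tuple]:
        # No generation comparison: a stale entry cannot outlive a bump.
        return self._walk_cache.get(key)
```

The trade-off is that returning the set by reference relies on callers honoring the read-only contract; the frozenset for the empty case at least makes accidental mutation of the shared object fail loudly.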
-
v0.5.421-rete-plus-develop
protected 7153e588 · RETE second-wave optimizations (cluster-cProfile-driven)

Builds on v0.5.420 baseline (combined_40_ings recipe wall 22.4s, dev cluster). This release ships 4 additional optimizations targeting v0.5.420's top remaining hot spots:

H1: prop dataclass plan caching (83eaa9767)
    unvalidated_construct + __post_init__ iterate __dataclass_fields__.items() on every Prop construction. Hoist to per-class plan cached at first touch.
    Local: -8.3% median two_origins, -62% items() count.

H2: pydantic-class isinstance bypass on RETE hot path (7153e5887)
    39.7% of isinstance calls on RETE hot path were pydantic Node/Condition checks (slow __instancecheck__ via _abc_instancecheck). Replace with type-name MRO frozenset cached per class. Class-level _or_filter_attr dispatch in alpha_network.py:1445 replaces 5-way isinstance branch.
    Local: -39.7% isinstance count on two_origins.

H3: per-GFM alpha index for reset_gfm_activation (a01bfdee5)
    6284 calls/recipe on combined_40 was iterating all alphas with prefix string compare. Build _alphas_by_gfm: dict[str, list[AlphaNode]] once at register_gfm time. Reset reads precomputed list instead of scanning.
    Local: -67% function-level cumtime; cluster impact at 6284 × 50µs scale.

H4: pydantic __getattr__ bypass in NodeAttribute(Condition|Alpha) (d32ceccc4)
    290K calls/recipe on combined_40 to pydantic main.__getattr__ at 1.58µs each (1.07s cumtime). NodeAttribute(Condition|Alpha) check `is_*` markers set via object.__setattr__; replace getattr with node.__dict__.get(name).
    Local: -42.3% __getattr__ count, -11.5% median wall on two_origins.

Tests: 510/510 green (orchestrator + benchmark suites + 7 new contract tests for reset_gfm_activation). CO2 stability preserved (test_benchmark_two_origins still asserts 1.2568).

Expected cluster gain (combined_40_ings): 3-5s off the 22.4s v0.5.420 baseline. cProfile decoded from the next dagster run will confirm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
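As an illustration of H3, the sketch below builds the per-GFM index once at registration and lets the reset walk only the precomputed list. The classes `AlphaNodeSketch` and `AlphaNetworkSketch` and the `active` field are assumptions made for the example; only _alphas_by_gfm, register_gfm and reset_gfm_activation are names taken from the note.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AlphaNodeSketch:
    """Hypothetical alpha node with just enough state for the example."""
    name: str
    active: bool = False


class AlphaNetworkSketch:
    def __init__(self) -> None:
        # Built once per GFM at registration time, read on every reset.
        self._alphas_by_gfm: dict[str, list[AlphaNodeSketch]] = defaultdict(list)

    def register_gfm(self, gfm_name: str, alphas: list[AlphaNodeSketch]) -> None:
        self._alphas_by_gfm[gfm_name].extend(alphas)

    def reset_gfm_activation(self, gfm_name: str) -> int:
        # .get avoids creating an entry for an unknown GFM name.
        alphas = self._alphas_by_gfm.get(gfm_name, ())
        for alpha in alphas:
            alpha.active = False
        return len(alphas)  # number of alphas touched, handy in tests
```

The reset cost now scales with the number of alphas owned by one GFM rather than with the size of the whole network, which is where a per-call scan with string prefix comparison would spend its time.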
-
v0.5.420-rete-plus-develop
protected 06b932bc · RETE hot-path optimizations driven by cluster cProfile evidence

Cluster cProfile on combined_40_ings (recipe wall 23.6s on dev, v0.5.419) showed orchestration at 82% of wall time, with these top hot spots:

    ncalls  tottime  function
    897K    1.23s    alpha_network.activate
    2.6M    0.74s    isinstance
    1.76M   0.79s    hasattr
    974K    0.60s    getattr
    403K    0.56s    alpha_network._get_related_at_depth
    1.39M   0.44s    _abc_instancecheck (driven by isinstance)

This release ships three changes targeting these:

1. _kind/_is_compound flags (8c8923aca)
   Replace per-callback isinstance + set-membership with class-level bit flags set at network-build time. Eliminates 2.6M isinstance calls on combined_40 → 0 in selective_evaluator hot path.

2. Per-batch BFS walk cache (53fd00f86)
   Cache RelatedNodeAlpha._get_related_at_depth results within a single on_facts_changed sweep. Measured 91.3% hit rate on two_origins fixture; on combined_40 the BFS cumtime is 2.4s → should drop to ~0.2s.

3. Drop hasattr/getattr probes on alpha hot path (06b932bc7)
   Hoist condition fields into PerRelatedNodeAlpha.__init__ (frozen dataclass, so this is sound). Drop dead hasattr guards for methods always defined on Node base class. Measured -29.9% hasattr, -21.4% getattr on two_origins; -2.6% wall time.

Tests: 503/503 green (orchestrator + benchmark suites).

Local two_origins: 3.05-3.20s (within run-to-run noise).

Expected cluster gain: 3-5s off the 23.6s combined_40_ings baseline, biggest impact on the 18.6s outside-GFM time (selective_evaluator, relational fan-out, alpha activation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
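The first change, replacing isinstance checks with class-level flags, could look roughly like the sketch below. The class names, the `KIND_*` constants and `count_needing_walk` are hypothetical; only the attribute names _kind and _is_compound come from the note, and here the flags are declared on the classes directly rather than assigned at network-build time.

```python
# Hypothetical bit flags; the real values and meanings are not in the note.
KIND_PROPERTY = 1 << 0
KIND_RELATED = 1 << 1
KIND_COMPOUND = 1 << 2


class AlphaSketch:
    _kind = 0              # overridden per subclass
    _is_compound = False


class PropertyAlphaSketch(AlphaSketch):
    _kind = KIND_PROPERTY


class RelatedAlphaSketch(AlphaSketch):
    _kind = KIND_RELATED


class AndAlphaSketch(AlphaSketch):
    _kind = KIND_COMPOUND
    _is_compound = True


def count_needing_walk(alphas) -> int:
    """Count alphas that are related-node or compound kinds.

    One attribute load and a bitwise AND per alpha, instead of an
    isinstance call that may go through a slow __instancecheck__.
    """
    mask = KIND_RELATED | KIND_COMPOUND
    return sum(1 for alpha in alphas if alpha._kind & mask)
```

A plain attribute read resolves through the class without invoking the ABC machinery, which is why this style of dispatch tends to be cheaper than isinstance on classes with custom __instancecheck__.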
-
v0.5.419-rete-plus-develop
protected 15851876 · Performance + diagnostics for combined recipe regression

Already-committed wins on this branch since v0.5.418:
- per-node alpha activation in on_facts_changed (a5eaffef1)
- skip asyncpg reset query on pool release (2a3fefff6)
- process-lifetime cache for find_uid_by_xid + find_access_group (36348c871, ee87a8769)
- inline cProfile + per-GFM CPU/wall + throttle + pool stats (f02fc1611)
- gate cProfile on enable_cprofile_inline, off by default (be65d76b2)
- cluster cProfile capture script with prod denylist + dev allowlist (73617a4be, 53d2fd2ef)
- contract test pinning DFU per-product dedup on Origin-split FPFs (158518767)

cProfile activation requires workflows MR 1054 + tag v0.1.474.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
-
v0.5.418-rete-plus-develop
protected be65d76b · Phase 5: gate cProfile on enable_cprofile_inline (defaults to False; fixes overhead leak in dagster benchmarks)
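One way to gate profiling behind a flag that is off by default is a small context manager, sketched here with the standard-library cProfile and pstats modules. The helper names `maybe_profile` and `format_stats` are hypothetical; only the flag name enable_cprofile_inline comes from the note.

```python
import cProfile
import io
import pstats
from contextlib import contextmanager


@contextmanager
def maybe_profile(enable_cprofile_inline: bool = False):
    """Yield a running profiler when enabled, otherwise yield None."""
    if not enable_cprofile_inline:
        # Disabled path: no profiler object is created, so no overhead leaks
        # into benchmark runs.
        yield None
        return
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        yield profiler
    finally:
        profiler.disable()


def format_stats(profiler: cProfile.Profile, top: int = 10) -> str:
    """Render the top entries by tottime as text."""
    buf = io.StringIO()
    pstats.Stats(profiler, stream=buf).sort_stats("tottime").print_stats(top)
    return buf.getvalue()
```

A caller would wrap the recipe run in `with maybe_profile(flag) as prof:` and log `format_stats(prof)` only when `prof` is not None.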
-
v0.5.417-rete-plus-develop
protected ee87a876 · Phase 4: cache find_or_create_uid_by_xid (saves 5 more queries/recipe)
-
v0.5.416-rete-plus-develop
protected 36348c87 · Phase 3: cache xid→uid and node→access_group_uid (saves 10 queries/recipe)
-
v0.5.415-rete-plus-develop
protected 2a3fefff · Phase 2: skip asyncpg reset query (50% fewer DB roundtrips per recipe)
-
v0.5.414-rete-plus-develop
protected a5eaffef · Phase 1: per-node alpha activation (5% local wall reduction at 8/16-ing scale)
-
v0.5.413-rete-plus-develop
protected f02fc161 · Add perf instrumentation: cProfile + per-GFM CPU/wall + throttle samples + pg pool stats inline
-
v0.5.412-rete-plus-develop
protected 2714cb39 · Revert v0.5.411 IAE skip-list (slower in dagster benchmark)
-
v0.5.411-rete-plus-develop
protected a66a47c2 · Skip IAE on structural FPAs (TMD/WS/NSD/Processing/Greenhouse); fixes speed regression cascade
-
v0.5.410-rete-plus-develop
protected ea537944 · Narrow Rainforest+Greenhouse skip to Origin/WS-duplicate FPFs (full core suite green)
-
v0.5.409-rete-plus-develop
protected 472faaf2 · Idempotent Rainforest + narrow Greenhouse skip on duplicate FPFs
-