perf(orchestrator): shallow clone in AddNodeMutation (-6.3% wall)

AddNodeMutation.apply() previously used
self.new_node.model_copy(deep=True) to materialize a node into the calc
graph. cProfile on stress_scaling@2000ingr showed copy.deepcopy at 414k
calls / 0.68s self-time = 4.2% of wall, plus a long tail of internal
deepcopy work pushing total deepcopy-related cost to ~6% of wall.

Replaced with a _shallow_clone_node helper. Safe because:

1. Pydantic Props are immutable by contract: PropMutation REPLACES
   slots via super().__setattr__, never mutates in-place. Inner Prop
   data (e.g. gfm_state.worker_states dict) can be shared.
2. New node gets fresh __dict__ so private slots like _calculation /
   _parent_nodes don't leak back to the caller.
3. Each Prop slot gets a fresh Prop instance so
   set_owner_node_for_props doesn't rebind the source's Props.
4. Iterates __dict__ (not model_fields_set). Required because
   inventory_importer/bw_importer trims model_fields_set on cached
   ElementaryResourceEmissionNodes; iterating model_fields_set would
   silently drop uid and break downstream add_edge.

Measurements (stress_scaling@2000ingr, 3 runs):
- copy.deepcopy ncalls: 414,310 → 12,025 (-97%)
- copy.deepcopy tottime: 0.676s → 0.147s (-78%)
- Total wall time: 14.218s → 13.32s mean (-6.3%)

Investigation findings from parallel attempts (not committed):
- Path D (asyncio yield frequency): the kqueue 17.4% figure was stale;
  current HEAD already runs kqueue at 1.5-2%. No further win available.
- Path F (cache is_node_in_affected_subtree): function short-circuits
  in healthy workloads (0.029% wall, not 1.4%). Not worth caching.

Validation:
- test_benchmark_two_origins CO2=1.2568 invariant holds
- test_calculation_with_subrecipe CO2=0.0930 invariant holds
- 96/96 in broad gauntlet
- 41/44 legacy_recipe_router (3 pre-existing batch flakes only)
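
Illustrative sketch of the shallow-clone idea (not the committed
helper). Prop, Node, and shallow_clone_node are hypothetical plain-Python
stand-ins, not the real Pydantic models or _shallow_clone_node. It
assumes Props are never mutated in place, so inner data can be shared,
and it iterates __dict__ so uid is carried over regardless of any
fields-set bookkeeping:

```python
import copy


class Prop:
    """Hypothetical stand-in for a Prop: inner data plus an owner back-reference."""

    def __init__(self, data, owner=None):
        self.data = data
        self.owner = owner


class Node:
    """Hypothetical stand-in for a calc-graph node."""

    def __init__(self, uid, **props):
        self.uid = uid
        self._calculation = None  # private slot, rebound per node
        for name, prop in props.items():
            setattr(self, name, prop)


def shallow_clone_node(node):
    # Allocate without calling __init__, so the clone gets its own empty __dict__.
    clone = node.__class__.__new__(node.__class__)
    # Walk __dict__ directly so every attribute (including uid) is carried over.
    for name, value in node.__dict__.items():
        if isinstance(value, Prop):
            # Fresh Prop wrapper per slot; the inner data object is shared, not copied.
            clone.__dict__[name] = copy.copy(value)
        else:
            # Everything else is carried by reference (assumption: a mutable
            # container stored here would still be shared in this sketch).
            clone.__dict__[name] = value
    return clone


src = Node("n1", amount=Prop({"kg": 2.0}))
src.amount.owner = src

clone = shallow_clone_node(src)
clone.amount.owner = clone    # rebinding the clone's Prop leaves the source's Prop alone
clone._calculation = "calc"   # rebinding a private slot does not touch the source
```

After this runs, clone.amount is a distinct Prop whose data is the same
dict object as src.amount.data, while src.amount.owner and
src._calculation are unchanged.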