perf(db): batch add_matching_required DB roundtrips (8215 -> 3 per calc)

Cluster impact: ~5ms saved per eliminated roundtrip; combined_40 is
expected to drop by several seconds, since the matching_required calls
formed a serial chain through MatchProductNameGFM.run(). Local cProfile
confirms a 99.96% drop in call count (8215 -> 3); local wall time is
unchanged within noise because a local Postgres roundtrip takes only
microseconds.

Validation:
- two_origins CO2=1.2568
- subrecipe canary CO2=0.0930
- 114/115 broad gauntlet (the 1 failure is a pre-existing flake, not
  caused by this change; confirmed via stash test: 3/5 pass on
  baseline, 3/5 pass with the change)
- Dedupe fix for ON CONFLICT DO UPDATE collision verified
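
Note on the dedupe: Postgres rejects a single INSERT ... ON CONFLICT DO
UPDATE statement that touches the same conflict key twice, so a batched
upsert needs its rows collapsed to one per key first. A minimal sketch
of that idea, not the code in this change; dedupe_by_key, the "name"
key and the sample rows are hypothetical:

```python
def dedupe_by_key(rows, key_fields):
    """Keep the last row seen for each conflict key.

    Order follows the first appearance of each key, because a dict
    keeps a key's original position when its value is overwritten.
    """
    latest = {}
    for row in rows:
        latest[tuple(row[f] for f in key_fields)] = row
    return list(latest.values())


# Hypothetical batch with a duplicate conflict key ("oat milk").
rows = [
    {"name": "oat milk", "required": True},
    {"name": "oat milk", "required": False},
    {"name": "rye bread", "required": True},
]
unique_rows = dedupe_by_key(rows, ["name"])
```

Keeping the last row mirrors what a serial chain of upserts would have
left in the table; whether that is the right winner depends on the
caller.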