feat(indexer): add plan module with AST, codegen, and ontology-driven pipeline plans
What does this MR do and why?
MR 2 of the SDLC v2 integration plan. Adds the structured query pipeline types that will replace the string-templated SQL in prepare.rs. Nothing is wired in yet; this is purely additive with no behavior change.
New files
plan/ast.rs — Minimal SQL AST (Query, Expr, SelectExpr, TableRef, Op). Only models what the ETL pipeline needs. Expr::Raw is the escape hatch for ClickHouse-specific fragments.
plan/codegen.rs — Walks the AST and emits SQL strings.
plan/from_ontology.rs — Builds PipelinePlan structs from ontology YAML. Handles both Table and Query ETL types, generates extract queries with watermark conditions, node transforms (int enum CASE expressions, column renaming), and FK edge transforms (multi-value delimiter splitting). Partitions plans into global vs namespaced.
plan/mod.rs — ExtractQuery owns cursor state and generates paginated SQL on demand via composite key DNF clauses. PipelinePlan and TransformOutput unify node and edge ETL into one abstraction.
checkpoint.rs — Checkpoint struct tracking watermark + cursor position, used by ExtractQuery::resume_from to pick up interrupted pagination.
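For reviewers unfamiliar with composite key pagination, here is a minimal sketch of how an AST-based cursor predicate could be built and emitted. It is illustrative only. The `Expr` and `Op` shapes are simplified stand-ins and may not match plan/ast.rs. `emit`, `bin`, and `cursor_predicate` are hypothetical names. It also assumes ascending key order and pre-quoted literals, none of which is confirmed by this MR.

```rust
/// Hypothetical, simplified expression AST; the real plan/ast.rs types may differ.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    /// A literal that is already valid SQL (assumption: quoting happens upstream).
    Literal(String),
    Binary(Box<Expr>, Op, Box<Expr>),
    /// Escape hatch for engine-specific fragments, emitted verbatim.
    Raw(String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Eq,
    Gt,
    And,
    Or,
}

/// Walks the expression tree and emits a fully parenthesised SQL string.
fn emit(expr: &Expr) -> String {
    match expr {
        Expr::Column(name) => name.clone(),
        Expr::Literal(value) => value.clone(),
        Expr::Raw(sql) => sql.clone(),
        Expr::Binary(lhs, op, rhs) => {
            let op = match op {
                Op::Eq => "=",
                Op::Gt => ">",
                Op::And => "AND",
                Op::Or => "OR",
            };
            format!("({} {} {})", emit(lhs), op, emit(rhs))
        }
    }
}

/// Small helper to keep tree construction readable.
fn bin(lhs: Expr, op: Op, rhs: Expr) -> Expr {
    Expr::Binary(Box::new(lhs), op, Box::new(rhs))
}

/// Builds the predicate selecting rows strictly after `cursor` in
/// ascending (k1, k2, ...) order, in disjunctive normal form:
/// (k1 > v1) OR (k1 = v1 AND k2 > v2) OR ...
/// Returns `None` for an empty cursor (first page, no predicate needed).
fn cursor_predicate(cursor: &[(&str, &str)]) -> Option<Expr> {
    let mut disjuncts = Vec::new();
    for i in 0..cursor.len() {
        let (col, val) = cursor[i];
        let mut term = bin(Expr::Column(col.into()), Op::Gt, Expr::Literal(val.into()));
        // Prepend equality checks on all earlier key columns, innermost last,
        // so the emitted SQL reads left to right in key order.
        for &(prev_col, prev_val) in cursor[..i].iter().rev() {
            let eq = bin(
                Expr::Column(prev_col.into()),
                Op::Eq,
                Expr::Literal(prev_val.into()),
            );
            term = bin(eq, Op::And, term);
        }
        disjuncts.push(term);
    }
    disjuncts.into_iter().reduce(|acc, term| bin(acc, Op::Or, term))
}
```

With a hypothetical cursor of `[("namespace_id", "7"), ("id", "42")]`, emitting the result yields `((namespace_id > 7) OR ((namespace_id = 7) AND (id > 42)))`. Building the clause as a tree rather than a format string is what lets the same codegen pass handle any key width.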
Next MRs
- MR 3: Expand checkpoint.rs with the ClickHouse persistence store
- MR 4: Rewire pipeline and handlers to use the plan module (actual switchover)
Testing
Unit and integration tests.
Performance Analysis
- This merge request does not introduce any performance regression.