SSA-based generic resolver for reference resolution (#457) · Issues · GitLab.org / orbit / GitLab Knowledge Graph

SSA-based generic resolver for reference resolution

## Problem to Solve Reference resolution — figuring out that `x()` calls `some_function` — is currently implemented per-language. Each language has its own resolver with its own approach to tracking variable bindings: | Language | Approach | Lines | |---|---|---| | Python | Custom symbol table visitor | 2,205 | | Ruby | Expression resolver + scope resolver + type map | 1,726 | | Kotlin | Expression resolver + file context | 2,166 | | Java | Expression resolver + file context | 1,206 | | C#, TS, Rust | Single-pass, limited resolution | ~1,000 | ~8,300 lines of resolution code. But the core question is the same for every language: *given a name at a call site, what definition was most recently bound to that name?* This is a reaching-definition problem. Compiler theory solved it decades ago with SSA (Static Single Assignment). ### SSA in 30 seconds ```python def caller(): x = foo x = bar x() # what does x call? ``` Simple: `x` was last assigned to `bar`, so `x()` calls `bar`. But add a branch: ```python def caller(): if condition: x = foo # ← block A else: x = bar # ← block B x() # ← block C (merge point) ``` Now `x` could be either `foo` or `bar`. SSA handles this with phi nodes: ``` Block A: write("x", foo) → x₁ = foo Block B: write("x", bar) → x₂ = bar Block C: read("x") → x₃ = φ(x₁, x₂) └─ "x is foo or bar — emit edges to both" ``` If both branches write the same value, the phi collapses trivially. The Braun et al. algorithm (CC 2013) builds this on the fly — no pre-computed CFG or dominance frontiers needed. Three operations: `write_variable`, `read_variable`, `seal_block`. ## Proposed Solution One generic SSA resolver parameterized by declarative rules. Per-language differences become configuration: ```rust struct PythonRules; impl HasRules for PythonRules { fn rules() -> ResolutionRules { ResolutionRules { import_strategies: vec![ ScopeFqnWalk, ExplicitImport, WildcardImport, SamePackage, SameFile, GlobalName, ], chain_mode: ChainMode::ValueFlow, receiver: ReceiverMode::Convention { name: "self", skip: true }, ..Default::default() } } } ``` The architecture: ``` CanonicalResult (defs, imports, refs, bindings) │ ▼ Scoped SSA Walker (one block per function/class scope, predecessor edges for scope nesting) │ write_variable / read_variable / seal_block │ ▼ SsaResolver (Braun et al.) phi insertion at join points trivial phi elimination │ ▼ RulesResolver + ImportStrategy chain (ScopeFqnWalk → ExplicitImport → WildcardImport → SamePackage → SameFile → GlobalName) │ ▼ Resolved edges inserted into CodeGraph ``` **Scoped walker** — one SSA block per function/class scope, not one flat block per file. A function's local `x = foo` can't collide with another function's `x = bar`. Predecessor edges connect inner scopes to outer scopes so name lookups walk outward. **Import strategy chain** — resolution tries strategies in order until one succeeds. Each strategy is a function from `(name, context)` to `Option<target_fqn>`. Languages differ only in which strategies they enable and in what order. **`EdgeSource`** — when the call site is at module level (no enclosing function), the source node is the File, producing `File → Definition` edges instead of `Definition → Definition`. ## Acceptance Criteria - [x] SSA engine: `write_variable`, `read_variable`, `seal_block`, phi insertion, trivial phi elimination - [x] Unit tests for SSA engine (single block, predecessor flow, phi at join, trivial collapse, loops, unsealed blocks, nested scopes) - [x] Scoped walker: one block per scope, predecessor edges for nesting, bindings written to correct scope - [x] `ResolutionRules` with `ImportStrategy`, `ChainMode`, `ReceiverMode` - [x] `RulesResolver<R: HasRules>` implementing `ReferenceResolver` - [x] Import strategy chain: ScopeFqnWalk, ExplicitImport, WildcardImport, SamePackage, SameFile, GlobalName - [x] `EdgeSource::File` for module-level calls - [x] Resolution rules for Python, Java, Kotlin - [x] Self-edges preserved for recursive calls, suppressed only for fallback sources

issue