feat(ts): basic intra-file reference resolution + infra for scope-aware resolution
What does this MR do and why?
This merge request introduces a foundational framework for scope-aware symbol analysis in the TypeScript parser. It builds directly on the recently introduced TypeScriptExpression representation, providing the necessary infrastructure to understand the lexical context of expressions. This is the critical next step towards implementing high-fidelity reference resolution that can handle complex, real-world code patterns like variable shadowing and data-flow-dependent method calls.
The Problem: Expressions Without Context
The previous work introduced TypeScriptExpression to model what an expression does. However, the parser lacked a robust mechanism to understand where that expression lives. To resolve a symbol like myService.getData(), the parser must answer questions like:
- What is `myService`? Where was it defined?
- What is the current lexical scope? Is `myService` shadowed by a more local definition?
- If `myService` was assigned the result of another expression, what was that expression's return type?
The existing system could not answer these questions reliably, making accurate reference resolution impossible.
The Solution: A Unified Symbol and Scope Analysis Framework
This MR introduces a new architectural layer to provide this missing context. The solution is composed of three key, interacting components:
- A Unified `TypeScriptSymbol` Enum (`crates/parser-core/src/typescript/references/types.rs`): A new enum was created to serve as a single, canonical representation for all key symbols in the code: definitions, imports, and references. This abstraction allows all symbols, regardless of their type, to be stored and indexed together in a uniform way.
- A Scope-Aware `ByteRangeSpatialIndex` (`crates/parser-core/src/typescript/references/lookup.rs`): The interval tree, previously used only for expressions, is now used to index all `TypeScriptSymbol`s. This creates a spatial map of the entire codebase's lexical structure. New methods were added to the index to facilitate hierarchical queries:
  - `find_all_containing(start, end)`: Finds all scopes that contain a given byte range, ordered from smallest to largest.
  - `find_immediate_parent(start, end)`: Finds the single, smallest scope that contains a given range.
  - `find_immediate_children(start, end)`: Finds all symbols defined directly within a given scope range.

  This turns the flat list of symbols into a navigable tree, representing the actual scope hierarchy of the code.
- A High-Level Scope Navigation Toolkit (`crates/parser-core/src/typescript/references/utils.rs`): A suite of new utilities was built on top of the symbol index to provide powerful analysis capabilities. While not all of these are fully integrated into the resolver yet, they form the core of the new framework and are heavily tested.
  - `crawl_scope_enhanced`: Allows a traversal to start at a specific expression and walk outwards through the scope hierarchy (e.g., from a method to its class, and then to the module level), providing access to the parent and children of each scope.
  - `find_shared_scope_expressions`: This utility identifies data-flow patterns within a single scope. It groups expressions by their containing scope and then links variable assignments to their subsequent uses. For example, it can programmatically link `x = new Database()` to later uses like `x.connect()`.
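To make the three hierarchical queries concrete, here is a minimal sketch of their intended semantics. It is not the code in `lookup.rs`: the `Symbol`, `Entry`, and `SymbolIndex` types and the `sample_index` helper are hypothetical stand-ins, and a linear scan replaces the interval tree for brevity. Ranges are assumed to be half-open with `start <= end`, and scopes are assumed to nest properly.

```rust
/// Hypothetical stand-in for `TypeScriptSymbol`; the real enum carries
/// richer payloads than a name.
#[derive(Debug, Clone, PartialEq)]
pub enum Symbol {
    Definition(String),
    Import(String),
    Reference(String),
}

/// A symbol plus the half-open byte range `[start, end)` it spans.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub start: usize,
    pub end: usize,
    pub symbol: Symbol,
}

/// Linear-scan stand-in for the interval-tree-backed index (assumption).
pub struct SymbolIndex {
    entries: Vec<Entry>,
}

impl SymbolIndex {
    pub fn new(entries: Vec<Entry>) -> Self {
        Self { entries }
    }

    /// True when `outer` contains `[start, end)` and is strictly larger.
    fn encloses(outer: &Entry, start: usize, end: usize) -> bool {
        outer.start <= start && end <= outer.end && (outer.end - outer.start) > (end - start)
    }

    /// All entries enclosing the range, ordered from smallest to largest.
    pub fn find_all_containing(&self, start: usize, end: usize) -> Vec<&Entry> {
        let mut hits: Vec<&Entry> = self
            .entries
            .iter()
            .filter(|e| Self::encloses(e, start, end))
            .collect();
        hits.sort_by_key(|e| e.end - e.start);
        hits
    }

    /// The single smallest entry enclosing the range, if any.
    pub fn find_immediate_parent(&self, start: usize, end: usize) -> Option<&Entry> {
        self.find_all_containing(start, end).into_iter().next()
    }

    /// Entries whose immediate parent is the entry spanning exactly
    /// `[start, end)`; deeper descendants are excluded.
    pub fn find_immediate_children(&self, start: usize, end: usize) -> Vec<&Entry> {
        self.entries
            .iter()
            .filter(|e| start <= e.start && e.end <= end && (e.end - e.start) < (end - start))
            .filter(|e| {
                self.find_immediate_parent(e.start, e.end)
                    .map_or(false, |p| p.start == start && p.end == end)
            })
            .collect()
    }
}

/// Example layout: module > class > method > reference.
pub fn sample_index() -> SymbolIndex {
    let def = |s: &str| Symbol::Definition(s.to_string());
    SymbolIndex::new(vec![
        Entry { start: 0, end: 100, symbol: def("module") },
        Entry { start: 10, end: 80, symbol: def("Database") },
        Entry { start: 20, end: 60, symbol: def("connect") },
        Entry { start: 30, end: 40, symbol: Symbol::Reference("x".to_string()) },
    ])
}
```

With `sample_index()`, querying the reference at `30..40` walks outward through the method, the class, and the module, while asking for the children of the class range `10..80` yields only the method, not the reference nested inside it.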
Consequences and Implementation Details
- Integration into the Analyzer (`analyzer.rs`): The main `TypeScriptAnalyzer` has been updated to call the new `resolve_references` function, formally integrating reference resolution into its analysis pipeline. The result, a `Vec<TypeScriptReferenceInfo>`, is now part of the final `TypeScriptAnalysisResult`.
- A Placeholder Resolver (`resolve.rs`): A new module, `resolve.rs`, now houses the main resolution logic. The initial implementation is intentionally naive, using a simple `HashMap` to perform a global, name-based lookup (`resolve_simple_call`). This serves as a working placeholder that allows the overall system to be wired together, while deferring the use of the more advanced scope-crawling tools.
- Enhanced Assignment Parsing (`expression.rs`): The expression parser has been made more sophisticated. It now distinguishes between simple assignments (e.g., `x = y`, `const z = 10`) and assignments that involve function or constructor calls. This separation allows for more direct analysis of simple data transfers.
- Extensive Foundational Testing: A significant portion of this MR is dedicated to adding tests for the new scope navigation infrastructure. The tests in `utils.rs` and `resolve.rs` thoroughly validate the correctness of `crawl_scope_enhanced` and `find_shared_scope_expressions` against the `complex.ts` fixture, ensuring the foundation is solid, even though these utilities are not yet fully consumed by the main resolver.
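The limits of a global, name-based lookup are easiest to see in a small sketch. The following is an illustration only, not the contents of `resolve.rs`: the `Definition` struct, `build_name_table`, and `resolve_simple_call_sketch` are hypothetical names, and the real `resolve_simple_call` may differ in signature and behavior.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified definition record (name plus byte range).
#[derive(Debug, Clone, PartialEq)]
pub struct Definition {
    pub name: String,
    pub start: usize,
    pub end: usize,
}

/// Builds a global name -> definition table. A later definition with the
/// same name replaces the earlier one, so the table is scope-blind.
pub fn build_name_table(defs: &[Definition]) -> HashMap<&str, &Definition> {
    defs.iter().map(|d| (d.name.as_str(), d)).collect()
}

/// Resolves a call purely by callee name, ignoring where the call occurs.
pub fn resolve_simple_call_sketch<'a>(
    table: &HashMap<&str, &'a Definition>,
    callee: &str,
) -> Option<&'a Definition> {
    table.get(callee).copied()
}
```

If two functions named `connect` are defined in different scopes, every call to `connect` resolves to whichever definition was inserted last, regardless of which one is actually visible at the call site. That is the gap the scope-crawling utilities are meant to close.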
What's Next: The Path to Advanced Reference Resolution
This MR lays the architectural groundwork for high-fidelity reference resolution. The current resolver is a simple placeholder; the next steps replace it with logic that uses the new infrastructure.
- Handling Shadowing: The naive `HashMap` lookup will be replaced with a call to `crawl_scope_enhanced`. To resolve a symbol, the engine will start at the expression's immediate scope and check for a definition. If not found, it will walk up to the parent scope and repeat the process, correctly handling variable shadowing by design.
- Implementing Data-Flow Analysis: The `find_shared_scope_expressions` utility will be used to track variable types. When resolving `x.connect()`, the engine will first find the expression that assigned to `x` (e.g., `x = new Database()`). By resolving the RHS of the assignment, it will determine the type of `x` and can then correctly resolve `.connect()` as a method on the `Database` class.
- Resolving `this` and Class Members: The scope-crawling tools will be used to find the containing class for a given expression. This will enable the correct resolution of `this` and provide the set of available members for resolving property and method accesses.
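The first two steps can be sketched together: walk outward from the innermost scope to find the nearest binding, then use the class assigned to that binding to resolve the method. This is a sketch under stated assumptions rather than the planned implementation. `Scope`, `lookup`, `resolve_method_call`, and `sample` are hypothetical, scopes are modelled as a flat list with parent indices instead of being derived from `crawl_scope_enhanced`, and each binding is assumed to already record the class named on the right-hand side of its assignment.

```rust
use std::collections::HashMap;

/// Hypothetical scope node. `bindings` maps a variable name to the class
/// constructed in its assignment (e.g. `x = new Database()` -> "Database").
pub struct Scope {
    pub parent: Option<usize>, // `None` for the module scope
    pub bindings: HashMap<String, String>,
}

/// Walks outward from `scope` and returns the innermost binding of `name`
/// as (scope index, class name). Inner bindings shadow outer ones.
pub fn lookup<'a>(scopes: &'a [Scope], mut scope: usize, name: &str) -> Option<(usize, &'a str)> {
    loop {
        let current = scopes.get(scope)?;
        if let Some(class) = current.bindings.get(name) {
            return Some((scope, class.as_str()));
        }
        scope = current.parent?; // stop once the module scope is exhausted
    }
}

/// Resolves `receiver.method()` to "Class.method" when the receiver's
/// class is known and declares that method.
pub fn resolve_method_call(
    scopes: &[Scope],
    classes: &HashMap<String, Vec<String>>,
    scope: usize,
    receiver: &str,
    method: &str,
) -> Option<String> {
    let (_, class) = lookup(scopes, scope, receiver)?;
    let members = classes.get(class)?;
    if members.iter().any(|m| m == method) {
        Some(format!("{}.{}", class, method))
    } else {
        None
    }
}

/// Example: module binds `x` to Database; function scope 1 shadows it
/// with Cache; scope 2 is a block inside scope 1; scope 3 is a sibling
/// function with no bindings of its own.
pub fn sample() -> (Vec<Scope>, HashMap<String, Vec<String>>) {
    let bind = |k: &str, v: &str| HashMap::from([(k.to_string(), v.to_string())]);
    let scopes = vec![
        Scope { parent: None, bindings: bind("x", "Database") },
        Scope { parent: Some(0), bindings: bind("x", "Cache") },
        Scope { parent: Some(1), bindings: HashMap::new() },
        Scope { parent: Some(0), bindings: HashMap::new() },
    ];
    let classes = HashMap::from([
        ("Database".to_string(), vec!["connect".to_string()]),
        ("Cache".to_string(), vec!["get".to_string()]),
    ]);
    (scopes, classes)
}
```

In this example, `x.connect()` inside scope 3 resolves to `Database.connect`, while the same call inside scope 2 does not resolve, because the nearest `x` there is the shadowing `Cache` instance, which has no `connect` method. A global name lookup cannot make that distinction.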
Related Issues
Testing
Performance Analysis
Performance Checklist
- Have you reviewed your memory allocations to ensure you're optimizing correctly? Are you cloning or copying unnecessary data?
- Have you profiled with `cargo bench` or `criterion` to measure performance impact?
- Are you using zero-copy operations where possible (e.g., `&str` instead of `String`, slice references)?
- Have you considered using `Cow<'_, T>` for conditional ownership to avoid unnecessary clones?
- Are iterator chains and lazy evaluation being used effectively instead of intermediate collections?
- Are you reusing allocations where possible (e.g., `Vec::clear()` and reuse vs new allocation)?
- Have you considered using `SmallVec` or similar for small, stack-allocated collections?
- Are async operations properly structured to avoid blocking the executor?
- Have you reviewed `unsafe` code blocks for both safety and performance implications?
- Are you using appropriate data structures (e.g., `HashMap` vs `BTreeMap` vs `IndexMap`)?
- Have you considered compile-time optimizations (e.g., `const fn`, generics instead of trait objects)?
- Are debug assertions (`debug_assert!`) used instead of runtime checks where appropriate?