chore(indexing): remove remaining parquet writer hardcoding + fqn cleanup
What does this MR do and why?
The primary goal of this MR was to replace the primitive string representation of Fully Qualified Names (FQNs) with a structured, type-safe enum, and to make the writer portion of the relationship schema more declarative.
The central architectural change was the introduction of the FqnType enum and its integration into the core DefinitionNode struct. This change shifted how FQNs are stored and managed throughout the system.
Previous Model: Premature Stringification
- Data Structure: In `crates/indexer/src/analysis/types.rs`, the `DefinitionNode` struct held the FQN as a `String`: `pub fqn: String`.
- Process: Each language analyzer (e.g., for Ruby, Python, Java) was responsible for converting its language-specific FQN data structure into a generic `String` early in the analysis process. For example, `ruby_fqn_to_string(&definition.fqn)` was called in `crates/indexer/src/analysis/languages/ruby/analyzer.rs`.
- Drawback: This approach discarded valuable structural information from the language-specific FQN, making it difficult to perform more advanced analysis later in the pipeline. All downstream consumers only had access to a flat string.
New Model: Preserving Structured FQN Data
- Data Structure: The `DefinitionNode.fqn` field was changed from `String` to the new `FqnType` enum: `pub fqn: FqnType`. This new enum, defined in `crates/indexer/src/analysis/types.rs`, is a wrapper for language-specific FQN types: `pub enum FqnType { Ruby(RubyFqn), Python(PythonFqn), /* ... other languages */ }`.
- Process: Language analyzers now construct an `FqnType` variant instead of a `String`. The conversion to a string representation is deferred and centralized via a `std::fmt::Display` implementation for `FqnType`. This implementation calls the appropriate language-specific string conversion function on demand. The rich, structured FQN is preserved throughout the analysis and mutation pipeline. String conversion is now an explicit, final step, rather than an immediate, lossy transformation.
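The deferred conversion described above can be sketched roughly as follows. This is a minimal illustration, not the code from this MR: the `parts` field on `RubyFqn` and `PythonFqn`, the `python_fqn_to_string` helper, and the `::` and `.` separators are assumptions made so the example is self-contained.

```rust
use std::fmt;

// Hypothetical stand-ins for the language-specific FQN types.
// The real RubyFqn / PythonFqn structs will look different.
#[derive(Debug, Clone, PartialEq)]
pub struct RubyFqn {
    pub parts: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PythonFqn {
    pub parts: Vec<String>,
}

// Assumed per-language conversions: "::" for Ruby, "." for Python.
fn ruby_fqn_to_string(fqn: &RubyFqn) -> String {
    fqn.parts.join("::")
}

fn python_fqn_to_string(fqn: &PythonFqn) -> String {
    fqn.parts.join(".")
}

// Wrapper enum: one variant per language, holding the structured FQN.
#[derive(Debug, Clone, PartialEq)]
pub enum FqnType {
    Ruby(RubyFqn),
    Python(PythonFqn),
}

// String conversion happens only here, on demand.
impl fmt::Display for FqnType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FqnType::Ruby(fqn) => f.write_str(&ruby_fqn_to_string(fqn)),
            FqnType::Python(fqn) => f.write_str(&python_fqn_to_string(fqn)),
        }
    }
}
```

Because `Display` is implemented, `to_string()` and `format!("{}", fqn)` are available on every `FqnType` value, while code that needs the structure can still `match` on the variant.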
The shift to FqnType and a related refactoring of the relationship schema had several consequences across the codebase, simplifying logic and enforcing greater type consistency.
Changes - Language Analyzers
Every language analyzer was modified to align with the new data model. Instead of calling a language-specific fqn_to_string function, they now wrap the native FQN structure in the FqnType enum.
- Affected Files:
  - `crates/indexer/src/analysis/languages/csharp.rs`
  - `crates/indexer/src/analysis/languages/java/analyzer.rs`
  - `crates/indexer/src/analysis/languages/kotlin/analyzer.rs`
  - `crates/indexer/src/analysis/languages/python/analyzer.rs`
  - `crates/indexer/src/analysis/languages/ruby/analyzer.rs`
  - `crates/indexer/src/analysis/languages/rust.rs`
  - `crates/indexer/src/analysis/languages/typescript.rs`
- Example Change (from `ruby/analyzer.rs`):
  - Before: `let fqn_string = ruby_fqn_to_string(&definition.fqn);`
  - After: `let fqn = FqnType::Ruby(definition.fqn.clone());`
Changes - Explicit String Conversion in Downstream Consumers
Components that relied on a string representation of an FQN (e.g., for HashMap keys, logging, or comparisons) were updated to explicitly call .to_string() on the FqnType field. This makes the conversion visible and intentional.
- Affected Components: `ExpressionResolver` implementations, `mutation/changes.rs`, and test suites.
- Example Change (from `tests.rs`):
  - Before: `.find(|def| def.fqn == "BaseModel")`
  - After: `.find(|def| def.fqn.to_string() == "BaseModel")`
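As a rough sketch of what the explicit conversion looks like in a consumer, the example below keys a `HashMap` by the string form and performs a string comparison. The simplified `FqnType` (wrapping a `Vec<String>` directly), the pared-down `DefinitionNode`, and the helper names `index_by_fqn` and `find_by_fqn` are hypothetical and exist only to keep the example self-contained.

```rust
use std::collections::HashMap;
use std::fmt;

// Hypothetical, simplified FqnType; the real enum wraps per-language FQN structs.
#[derive(Debug, Clone)]
pub enum FqnType {
    Ruby(Vec<String>),
    Python(Vec<String>),
}

impl fmt::Display for FqnType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Assumed separators for illustration only.
            FqnType::Ruby(parts) => f.write_str(&parts.join("::")),
            FqnType::Python(parts) => f.write_str(&parts.join(".")),
        }
    }
}

// Pared-down DefinitionNode carrying only the field relevant here.
#[derive(Debug)]
pub struct DefinitionNode {
    pub fqn: FqnType,
}

// Build a lookup table keyed by the string form of each FQN.
// The conversion is an explicit `.to_string()` call at the point of use.
pub fn index_by_fqn(defs: &[DefinitionNode]) -> HashMap<String, &DefinitionNode> {
    defs.iter().map(|def| (def.fqn.to_string(), def)).collect()
}

// Find a definition by comparing against its string form.
pub fn find_by_fqn<'a>(defs: &'a [DefinitionNode], name: &str) -> Option<&'a DefinitionNode> {
    defs.iter().find(|def| def.fqn.to_string() == name)
}
```

Note that `to_string()` allocates a new `String` on every call, so a consumer that compares many times may prefer to convert once and reuse the result, as `index_by_fqn` does.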
Changes - Refactoring of the Relationship Schema
In parallel with the FQN changes, the definition of relationships between nodes was made more declarative and centralized.
- `RelationshipKind` Enum Movement: The `RelationshipKind` enum, which defines the type of a graph edge (e.g., `FileToDefinition`), was moved from `crates/indexer/src/analysis/types.rs` to the schema definition crate at `crates/database/src/schema/types.rs`. This co-locates the relationship type with other core schema definitions.
- Schema Declaration: The `RelationshipTable` struct in `crates/database/src/schema/types.rs` was modified. Its `from_to_pairs` field was changed from `(&'static NodeTable, &'static NodeTable)` to `(&'static NodeTable, &'static NodeTable, Option<&'static RelationshipKind>)`. This embeds the relationship kind directly into the schema definition in `crates/database/src/schema/init.rs`.
- Logic Simplification in `WriterService`: The `get_relationships_for_pair` function in `crates/indexer/src/analysis/types.rs` was deleted. This function contained a large, hardcoded `match` statement to determine which relationships to process for a given pair of node tables. The `WriterService` (`crates/indexer/src/writer.rs`) now iterates directly over the `from_to_pairs` in the schema, using the provided `RelationshipKind` to filter and write the correct set of relationships. This makes the logic data-driven and removes the need for procedural mapping.
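The data-driven approach can be sketched as below. Everything here is a simplified assumption rather than the actual schema code: the fields of `NodeTable`, `RelationshipTable`, and `Relationship`, the `DefinitionToDefinition` variant, the table names, and the helpers `relationships_for_pair` and `plan_writes` are all hypothetical. The MR does not spell out what a `None` kind means, so this sketch treats it as "no filter".

```rust
// Hypothetical, pared-down schema types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RelationshipKind {
    FileToDefinition,
    DefinitionToDefinition, // hypothetical variant for illustration
}

#[derive(Debug)]
pub struct NodeTable {
    pub name: &'static str,
}

pub struct RelationshipTable {
    pub name: &'static str,
    // Each entry: (from table, to table, optional relationship kind).
    pub from_to_pairs: &'static [(
        &'static NodeTable,
        &'static NodeTable,
        Option<&'static RelationshipKind>,
    )],
}

#[derive(Debug)]
pub struct Relationship {
    pub kind: RelationshipKind,
    pub source_id: u32,
    pub target_id: u32,
}

pub static FILE_TABLE: NodeTable = NodeTable { name: "FileNode" };
pub static DEFINITION_TABLE: NodeTable = NodeTable { name: "DefinitionNode" };

// The schema declares which kind belongs to which table pair.
pub static RELATIONSHIPS: RelationshipTable = RelationshipTable {
    name: "RELATIONSHIPS",
    from_to_pairs: &[
        (&FILE_TABLE, &DEFINITION_TABLE, Some(&RelationshipKind::FileToDefinition)),
        (&DEFINITION_TABLE, &DEFINITION_TABLE, None),
    ],
};

// Select the relationships for one pair. Assumption: `None` means no filter.
pub fn relationships_for_pair<'a>(
    all: &'a [Relationship],
    kind: Option<&RelationshipKind>,
) -> Vec<&'a Relationship> {
    all.iter()
        .filter(|rel| kind.map_or(true, |k| rel.kind == *k))
        .collect()
}

// Walk the schema instead of a hardcoded match; returns (pair label, row count).
pub fn plan_writes(schema: &RelationshipTable, all: &[Relationship]) -> Vec<(String, usize)> {
    schema
        .from_to_pairs
        .iter()
        .map(|(from, to, kind)| {
            let selected = relationships_for_pair(all, *kind);
            (format!("{}->{}", from.name, to.name), selected.len())
        })
        .collect()
}
```

With this shape, supporting a new table pair means adding one entry to `from_to_pairs`; the loop in the writer stays unchanged.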
Related Issues
Testing
All existing unit and integration tests pass.
Performance Analysis
- This merge request does not introduce any performance regression. If a performance regression is expected, explain why.