feat(parser): setup initial library
This merge request introduces the initial implementation of the parser-core library, a foundational crate for parsing and analyzing source code.
closes [KG] Create Initial Library Code (#16 - closed)
What does this MR do?
- Establishes Core Architecture: Sets up the main components for parsing, including language detection, a generic parsing interface, and a rule management system.
- Introduces Ruby Support: Provides the first language implementation for Ruby, including a tree-sitter-based parser and initial pattern-matching rules. This is so that the subsequent MR feat(ruby): implement ruby definitions (!17 - merged), which builds on this MR, can integrate more easily.
- Defines the Rule System: Implements a rule system based on ast-grep's YAML format for identifying code constructs like class and method definitions.
- Adds Match Extraction: Provides a structured way to access match results, including both high-level serializable data (MatchInfo) and direct access to raw AST nodes (MatchWithNodes).
- Includes Comprehensive Testing: Adds unit and integration tests to validate the parsing and rule-matching pipeline.
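To make the two match views concrete, here is a minimal sketch of how an owned, serializable record and a node-borrowing view could relate. Only the type names MatchInfo and MatchWithNodes come from this MR; the fields, the `Node` stand-in for a tree-sitter node, and the `text`/`to_info` helpers are assumptions for illustration and may not match the actual crate.

```rust
// Hypothetical sketch: the type names come from this MR, the fields do not.

/// Stand-in for a tree-sitter node: a kind plus a byte range into the source.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Node<'a> {
    pub kind: &'a str,
    pub start_byte: usize,
    pub end_byte: usize,
}

/// Owned, plain-data view of a match; straightforward to serialize.
#[derive(Debug, Clone, PartialEq)]
pub struct MatchInfo {
    pub rule_id: String,
    pub text: String,
    pub start_byte: usize,
    pub end_byte: usize,
}

/// Borrowing view that keeps the raw node available for further traversal.
#[derive(Debug, Clone, PartialEq)]
pub struct MatchWithNodes<'a> {
    pub rule_id: &'a str,
    pub source: &'a str,
    pub node: Node<'a>,
}

impl<'a> MatchWithNodes<'a> {
    /// Text covered by the matched node.
    pub fn text(&self) -> &'a str {
        &self.source[self.node.start_byte..self.node.end_byte]
    }

    /// Copy the match into an owned `MatchInfo`, dropping the node reference.
    pub fn to_info(&self) -> MatchInfo {
        MatchInfo {
            rule_id: self.rule_id.to_string(),
            text: self.text().to_string(),
            start_byte: self.node.start_byte,
            end_byte: self.node.end_byte,
        }
    }
}
```

The split keeps lifetimes out of the data that leaves the library: callers that only need results can hold `MatchInfo` values, while callers that want to walk the tree keep the borrowing view.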
Library Overview
The parser-core library is designed to parse source code into an Abstract Syntax Tree (AST) using tree-sitter and then apply ast-grep rules to find patterns.
Architecture
graph TB
subgraph "Input"
A[Source Code File]
end
subgraph "Parsing Pipeline"
A --> B{Language Detection};
B --> C[Generic Parser];
C --> D[Tree-sitter AST];
end
subgraph "Rule Matching"
D --> E[RuleManager];
E -- "Executes" --> F(Pattern Matching);
F -- "Produces" --> G[Match Results];
end
subgraph "Post-processing"
G -- "Are Transformed Into" --> K(Structured Data);
end
subgraph "Library Internals"
H[parser.rs]
I[rules.rs]
J[ruby/definitions.rs]
end
H -.-> C;
I -.-> E;
J -.-> K;
A high-level diagram illustrating the data flow from source code to match results.
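The flow in the diagram can be sketched end to end with the standard library alone. This is a toy under stated assumptions: the line-based `parse` function stands in for tree-sitter, the kind-matching `Rule` stands in for an ast-grep YAML rule, and every name except RuleManager (`Language`, `detect_language`, `MatchResult`, `analyze`) is hypothetical rather than taken from the crate.

```rust
use std::path::Path;

/// Languages the sketch knows about (this MR ships Ruby only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Language {
    Ruby,
}

/// Language detection: map a file extension to a language.
pub fn detect_language(path: &str) -> Option<Language> {
    match Path::new(path).extension()?.to_str()? {
        "rb" => Some(Language::Ruby),
        _ => None,
    }
}

/// Toy stand-in for an AST node; the real library gets nodes from tree-sitter.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub kind: &'static str,
    pub text: String,
    pub line: usize,
}

/// Toy "parser": classifies lines instead of building a real syntax tree.
pub fn parse(language: Language, source: &str) -> Vec<Node> {
    match language {
        Language::Ruby => source
            .lines()
            .enumerate()
            .filter_map(|(index, raw)| {
                let text = raw.trim();
                let kind = if text.starts_with("class ") {
                    "class"
                } else if text.starts_with("def ") {
                    "method"
                } else {
                    return None; // lines that are not definitions are skipped
                };
                Some(Node { kind, text: text.to_string(), line: index + 1 })
            })
            .collect(),
    }
}

/// Toy rule: matches nodes by kind (a real rule would be an ast-grep pattern).
pub struct Rule {
    pub id: &'static str,
    pub kind: &'static str,
}

/// Structured data produced for each match.
#[derive(Debug, Clone, PartialEq)]
pub struct MatchResult {
    pub rule_id: String,
    pub text: String,
    pub line: usize,
}

pub struct RuleManager {
    rules: Vec<Rule>,
}

impl RuleManager {
    pub fn new(rules: Vec<Rule>) -> Self {
        Self { rules }
    }

    /// Run every rule against every node and collect the matches in node order.
    pub fn execute(&self, nodes: &[Node]) -> Vec<MatchResult> {
        let mut results = Vec::new();
        for node in nodes {
            for rule in self.rules.iter().filter(|rule| rule.kind == node.kind) {
                results.push(MatchResult {
                    rule_id: rule.id.to_string(),
                    text: node.text.clone(),
                    line: node.line,
                });
            }
        }
        results
    }
}

/// Full pipeline: detect the language, parse, then apply the rules.
/// Returns `None` when the file extension is not recognized.
pub fn analyze(path: &str, source: &str, manager: &RuleManager) -> Option<Vec<MatchResult>> {
    let language = detect_language(path)?;
    let nodes = parse(language, source);
    Some(manager.execute(&nodes))
}
```

Calling `analyze("greeter.rb", source, &manager)` with rules for the `class` and `method` kinds returns one `MatchResult` per definition line, while an unrecognized extension such as `notes.txt` yields `None` before any parsing happens.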
Components
| File | Responsibility |
|---|---|
| parser.rs | Defines supported languages and provides the core parsing logic. |
| rules.rs | Manages rule loading, execution, and processing of match results. |
| ruby/ruby_ast.rs | Contains the language-specific configuration for Ruby. |
| lib.rs | The main library crate, exporting key components and error types. |
More detailed documentation for the library can be found in docs/library_overview.md.
A Note on Analyzer
Post-processing of matches is not covered in this implementation, as we've decided to keep language implementations separate for the initial iterations. MR feat(ruby): implement ruby definitions (!17 - merged) covers how we intend to do post-processing for extracting definitions and more.