
feat(parser): setup initial library

This merge request introduces the initial implementation of the parser-core library, a foundational crate for parsing and analyzing source code.

closes [KG] Create Initial Library Code (#16 - closed)

What does this MR do?

  • Establishes Core Architecture: Sets up the main components for parsing, including language detection, a generic parsing interface, and a rule management system.
  • Introduces Ruby Support: Provides the first language implementation, for Ruby, including a tree-sitter-based parser and initial pattern-matching rules. This groundwork lets the follow-up MR feat(ruby): implement ruby definitions (!17 - merged), which builds on this one, integrate more easily.
  • Defines the Rule System: Implements a rule system based on ast-grep's YAML format for identifying code constructs like class and method definitions.
  • Adds Match Extraction: Provides a structured way to access match results, including both high-level serializable data (MatchInfo) and direct access to raw AST nodes (MatchWithNodes).
  • Includes Comprehensive Testing: Adds unit and integration tests to validate the parsing and rule-matching pipeline.
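To illustrate the split between the two match views, here is a minimal sketch. Only the names MatchInfo and MatchWithNodes come from this MR; the Node stand-in, every field, and the to_info method are hypothetical and chosen only to show the idea: a borrowed view keeps access to the AST node, while an owned view can outlive the tree and be serialized.

```rust
/// Hypothetical stand-in for a raw AST node. In the real crate this would be
/// a tree-sitter / ast-grep node type, not a hand-written struct.
#[derive(Debug)]
pub struct Node<'src> {
    pub kind: &'static str,
    pub text: &'src str,
    pub start_byte: usize,
    pub end_byte: usize,
}

/// Borrowed view: keeps a reference to the node so callers can keep walking the AST.
/// (Fields are assumptions for illustration.)
pub struct MatchWithNodes<'a, 'src> {
    pub rule_id: &'a str,
    pub node: &'a Node<'src>,
}

/// Owned view: plain data with no lifetimes, so it can outlive the parsed tree.
/// (Fields are assumptions for illustration.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MatchInfo {
    pub rule_id: String,
    pub kind: String,
    pub text: String,
    pub start_byte: usize,
    pub end_byte: usize,
}

impl<'a, 'src> MatchWithNodes<'a, 'src> {
    /// Copies the interesting parts of the node into an owned MatchInfo.
    pub fn to_info(&self) -> MatchInfo {
        MatchInfo {
            rule_id: self.rule_id.to_string(),
            kind: self.node.kind.to_string(),
            text: self.node.text.to_string(),
            start_byte: self.node.start_byte,
            end_byte: self.node.end_byte,
        }
    }
}
```

Under these assumptions, callers that need to inspect child nodes would work with the borrowed view, and callers that only need to report or store results would convert to the owned one.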

Library Overview

The parser-core library is designed to parse source code into an Abstract Syntax Tree (AST) using tree-sitter and then apply ast-grep rules to find patterns.

Architecture


graph TB
    subgraph "Input"
        A[Source Code File]
    end

    subgraph "Parsing Pipeline"
        A --> B{Language Detection};
        B --> C[Generic Parser];
        C --> D[Tree-sitter AST];
    end

    subgraph "Rule Matching"
        D --> E[RuleManager];
        E -- "Executes" --> F(Pattern Matching);
        F -- "Produces" --> G[Match Results];
    end

    subgraph "Post-processing"
        G -- "Are Transformed Into" --> K(Structured Data);
    end

    subgraph "Library Internals"
        H[parser.rs]
        I[rules.rs]
        J[ruby/definitions.rs]
    end

    H -.-> C;
    I -.-> E;
    J -.-> K;

A high-level diagram illustrating the data flow from source code to match results.
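As a rough sketch of the first step in the diagram, language detection could key off the file extension. This is an assumption: the MR does not say how detection works, and the SupportedLanguage enum and detect_language function below are hypothetical names, not the crate's actual API.

```rust
use std::path::Path;

/// Hypothetical enum of languages the library can parse; Ruby is the only
/// language this MR introduces.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SupportedLanguage {
    Ruby,
}

/// Guesses the language from the file extension.
/// Returns None for files without an extension or with an unknown one.
pub fn detect_language(path: &Path) -> Option<SupportedLanguage> {
    match path.extension()?.to_str()? {
        "rb" => Some(SupportedLanguage::Ruby),
        _ => None,
    }
}
```

Returning an Option keeps the "unsupported file" case explicit, so the caller decides whether to skip the file or report an error.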

Components

| File | Responsibility |
| --- | --- |
| parser.rs | Defines supported languages and provides the core parsing logic. |
| rules.rs | Manages rule loading, execution, and processing of match results. |
| ruby/ruby_ast.rs | Contains the language-specific configuration for Ruby. |
| lib.rs | The main library crate, exporting key components and error types. |
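For readers unfamiliar with how a crate root typically exposes error types, here is a minimal sketch using only the standard library. The ParserError name, its variants, and the messages are hypothetical; the MR only states that lib.rs exports error types, not what they look like.

```rust
use std::fmt;

/// Hypothetical error type; variants are assumptions for illustration.
#[derive(Debug)]
pub enum ParserError {
    /// No parser is available for the given file.
    UnsupportedLanguage(String),
    /// A rule could not be loaded or parsed.
    RuleLoad { rule_id: String, reason: String },
}

impl fmt::Display for ParserError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParserError::UnsupportedLanguage(file) => {
                write!(f, "unsupported language for file: {file}")
            }
            ParserError::RuleLoad { rule_id, reason } => {
                write!(f, "failed to load rule '{rule_id}': {reason}")
            }
        }
    }
}

// Implementing std::error::Error lets callers use `?` with Box<dyn Error>.
impl std::error::Error for ParserError {}
```

A single enum at the crate root gives callers one type to match on, whichever stage of the pipeline failed.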

More detailed documentation for the library can be found in docs/library_overview.md.

A Note on Analyzer

Post-processing of matches is out of scope for this MR, as we've decided to keep language implementations separate for the initial iterations. MR feat(ruby): implement ruby definitions (!17 - merged) covers how we intend to post-process matches to extract definitions and more.

Edited by Michael Angelo Rivera
