example 1
prompt:
gitlab-org&17514 What is this epic about? What issues are included in this epic? Any recent discussions on this epic? Please provide an extremely detailed report. Include a list of references and links at the bottom of the report.
Detailed Report: Knowledge Graph First Iteration (Epic gitlab-org&17514)
Executive Summary
The Knowledge Graph First Iteration (Epic gitlab-org&17514) is a strategic initiative at GitLab to create a comprehensive, structured, and queryable representation of code repositories. This system aims to provide deep insights into code structure, dependencies, and relationships to enhance GitLab's AI capabilities, developer experience, and code intelligence features. The epic encompasses several child epics and issues focusing on different components of the Knowledge Graph architecture, from parsing to database integration.
Vision and Problem Statement
GitLab's Knowledge Graph project envisions a tool that allows developers to visualize how every file, class, and import in a repository fits together. The project aims to be:
- An open-source, standalone solution working across operating systems
- Deeply integrated with GitLab features (Monolith, Language Server, CLI, AI)
- A visualization tool for repository code and GitLab features
The fundamental problem being addressed is that current AI data sources cannot provide comprehensive context for entire repositories. Basic code chunking and description enrichment are insufficient for:
- Finding test files
- Finding references across functions
- Navigating large, unfamiliar codebases
Solution Architecture
The GitLab Knowledge Graph consists of five key architectural components:
- Unified Parser Project (gitlab-code-parser) (Epic gitlab-org&17516)
  - A high-performance static analysis library built in Rust
  - Leverages tree-sitter and ast-grep
  - Provides consistent parsing across supported languages
  - Runs both server-side and client-side via Wasm/FFI
- Graph Database Technology
  - Uses Kuzu, an embeddable graph database
  - Accessed via customized database clients
- Knowledge Graph Core Project (Epic gitlab-org&17517)
  - Contains central logic for extracting ASTs via gitlab-code-parser
  - Defines graph nodes/edges and matches entities
  - Manages data structures for graph construction and querying
  - Exposes crates for both server-side and client-side indexing
- Knowledge Graph Server Architecture (Epic gitlab-org&17518 (closed))
  - Creates an indexer worker that wraps the core Rust project
  - Provides an API service that allows Rails to query graph nodes
- Client-side Repository Interaction (gitalisk) (Epic gitlab-org&17515)
  - Rust-based library for efficient cross-platform git operations
  - Used for accessing repository data and structure during indexing
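The report does not spell out the graph schema, so the following is only a minimal sketch of what "defines graph nodes/edges" could look like. The node kinds (file, function), the edge kinds (IMPORTS, DEFINES), and every class and method name here are hypothetical and are not taken from the Knowledge Graph project.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    # Hypothetical node: `kind` might be "file", "class", or "function".
    id: str
    kind: str
    name: str


@dataclass(frozen=True)
class Edge:
    # Hypothetical directed edge, e.g. "IMPORTS" from one file to another.
    source: str
    target: str
    kind: str


@dataclass
class CodeGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.id] = node

    def add_edge(self, edge: Edge) -> None:
        # Reject edges whose endpoints have not been added yet.
        if edge.source not in self.nodes or edge.target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(edge)

    def neighbors(self, node_id: str, kind: str) -> list:
        # Follow outgoing edges of one kind from `node_id`.
        return [self.nodes[e.target] for e in self.edges
                if e.source == node_id and e.kind == kind]


graph = CodeGraph()
graph.add_node(Node("f:app.py", "file", "app.py"))
graph.add_node(Node("f:util.py", "file", "util.py"))
graph.add_node(Node("fn:util.slugify", "function", "slugify"))
graph.add_edge(Edge("f:app.py", "f:util.py", "IMPORTS"))
graph.add_edge(Edge("f:util.py", "fn:util.slugify", "DEFINES"))
```

In this sketch, a question such as "which files does app.py import?" becomes `graph.neighbors("f:app.py", "IMPORTS")`. A graph database such as Kuzu would answer the same kind of question with a query instead of an in-memory scan.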
Child Epics and Their Purpose
The Knowledge Graph initiative is broken down into four main child epics:
- gitlab-org&17517 (Knowledge Graph Core Indexer Project)
  - Implements the core indexing logic in Rust
  - Contains several crates including:
    - CLI crate for standalone operation
    - Core crate with base logic
    - Client indexer with performance optimizations
    - Database client abstraction
  - Handles both initial and incremental indexing workflows
- gitlab-org&17516 (One Parser - gitlab-code-parser)
  - Creates a unified Rust-based static code analysis library
  - Provides consistent parsing across languages
  - Supports multiple export formats (Rust crate, WASM, FFI)
  - Enables cross-platform usage from server to browser
- gitlab-org&17515 (Client-side Repository Interaction - gitalisk)
  - Develops a cross-platform Git operations library
  - Provides efficient access to repository data structure
- gitlab-org&17518 (closed) (Knowledge Graph Server Architecture)
  - No detailed information available in the research findings
Key Issues and Current Status
Several key issues are being tracked as part of this initiative:
- Project Security (gitlab-org/gitlab#540414)
  - Focuses on implementing security best practices for the Rust projects
  - Protection for the main branch and dependency scanning are already in place
  - Several security measures remain to be implemented
- Rust Crate Publishing (gitlab-org/gitlab#536080 (closed))
  - Addresses sharing Rust libraries across Knowledge Graph projects
  - Investigating options including Git dependencies, GitLab Cargo Registry, and crates.io
  - Aims to enable reproducible builds and cross-team collaboration
- Go Package POC (gitlab-org/gitlab#536081 (closed))
  - Creating bindings for Go applications to use the Rust parser
  - Following an approach similar to KuzuDB with pre-compiled binaries
  - Currently scheduled for milestone 18.1
- TypeScript/JavaScript Import Handling (gitlab-org/rust/gitlab-code-parser#2 (closed))
  - Documents import/require syntax patterns to be supported
  - Provides detailed technical specifications for implementation
- Initial Project Creation (gitlab-org/gitlab#536077 (closed))
  - Setting up the foundational Rust workspace
  - Creating the basic project structure and tooling
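To make the "import/require syntax patterns" item more concrete, here is a rough sketch that extracts module specifiers from a JavaScript snippet. It uses simplified regular expressions as a stand-in. The report says the parser relies on tree-sitter and ast-grep, so this is not the project's approach, and the three patterns covered here are an assumption about which forms matter rather than the issue's actual list.

```python
import re

# Simplified patterns; a real parser would walk the syntax tree instead.
IMPORT_FROM = re.compile(
    r"""^\s*import\s+.+?\s+from\s+['"]([^'"]+)['"]""", re.MULTILINE)
BARE_IMPORT = re.compile(r"""^\s*import\s+['"]([^'"]+)['"]""", re.MULTILINE)
REQUIRE = re.compile(r"""\brequire\(\s*['"]([^'"]+)['"]\s*\)""")


def find_imports(source):
    """Return the module specifiers referenced by `source`, sorted."""
    found = set()
    for pattern in (IMPORT_FROM, BARE_IMPORT, REQUIRE):
        found.update(pattern.findall(source))
    return sorted(found)


sample = """
import React from 'react';
import { join } from "node:path";
import './styles.css';
const fs = require('fs');
"""

imports = find_imports(sample)
```

Regular expressions like these miss multi-line imports, dynamic `import()` calls, and specifiers inside comments or strings, which is one reason a syntax-tree-based parser is the more robust choice.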
Historical Context and Related Initiatives
The Knowledge Graph First Iteration builds upon previous work:
- X-Ray Graph Proof of Concept (Epic gitlab-org&16251)
  - Initial POC demonstrated value of knowledge graph approach
  - Followed a two-track approach:
    - Track 1: Value Validation
    - Track 2: Technical Foundation
  - Focused on server-side repositories and local integrations
- "Chat with your codebase" Initiative (Epic gitlab-org&16910)
  - Enhances GitLab's Duo AI offering
  - Allows users to interact with entire codebases through natural language
  - Knowledge Graph is mentioned as a post-MVC iteration
- Use Case Collection (gitlab-org/gitlab#508978 (closed))
  - Gathered specific use cases for the Knowledge Graph
  - Example use cases included impact analysis, repository navigation, and code suggestions context
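Impact analysis, one of the use cases listed above, maps naturally onto a graph traversal: given a changed file, walk the dependency edges backwards to find everything that relies on it. The sketch below illustrates the idea with a breadth-first search over a hypothetical list of (importer, imported) pairs. The function name and the sample data are invented for illustration.

```python
from collections import defaultdict, deque


def impacted_by(edges, changed):
    """Return every node that transitively depends on `changed`.

    `edges` is an iterable of (dependent, dependency) pairs.
    """
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents = defaultdict(set)
    for dependent, dependency in edges:
        dependents[dependency].add(dependent)

    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            # Skip nodes already visited, and the starting node on cycles.
            if dependent not in seen and dependent != changed:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical file-level dependencies: (importer, imported).
edges = [
    ("api.py", "models.py"),
    ("views.py", "api.py"),
    ("test_api.py", "api.py"),
    ("cli.py", "config.py"),
]
```

With this data, a change to models.py reaches api.py directly and views.py and test_api.py indirectly, while cli.py is unaffected.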
Technical Implementation Approach
The implementation approach includes several technical decisions:
- Parsing Strategy
  - Using tree-sitter and ast-grep for code parsing
  - ast-grep rules will live within the parser project
  - Parser will serve multiple features (Knowledge Graph, Embeddings, IDE features)
- Indexing Strategy
  - Initial indexing uses KuzuDB's bulk import feature from CSV/Parquet files
  - Incremental updates handled differently for client-side and server-side
  - Server-side updates create a copy of KuzuDB before making changes
- Interface Between Components
  - Initial approach builds a Rust CLI application
  - Server-side Go worker checks out files and executes the Rust CLI
  - Alternative approach using FFI under consideration for future
- Database Querying Approaches
  - Server-side: Rails services use an abstraction layer to talk to graph nodes
  - Client-side: Queries made directly through Rust crate or local HTTP server
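The indexing strategy above combines two ideas: staging data as CSV files for a bulk import, and applying server-side updates to a copy rather than to the live database. The sketch below illustrates both with the standard library only. It does not call KuzuDB, it assumes the database is stored as a directory on disk, and the helper names, the `.next` suffix, and the `files.csv` layout are all hypothetical.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def write_csv(path, header, rows):
    # One CSV file per node or relationship table, ready for a bulk import.
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def update_on_copy(db_dir, apply_changes):
    """Copy `db_dir`, apply changes to the copy, and return the copy's path.

    The original directory is left untouched, so readers can keep using it
    until the caller decides to switch over to the updated copy.
    """
    db_dir = Path(db_dir)
    copy_dir = db_dir.with_name(db_dir.name + ".next")
    if copy_dir.exists():
        shutil.rmtree(copy_dir)
    shutil.copytree(db_dir, copy_dir)
    apply_changes(copy_dir)
    return copy_dir


def add_file(target_dir):
    # Stand-in for an incremental update: append one row to the copy.
    with open(target_dir / "files.csv", "a", newline="") as handle:
        csv.writer(handle).writerow(["2", "util.py"])


# Demo on a temporary directory standing in for a database directory.
workdir = Path(tempfile.mkdtemp())
db_dir = workdir / "graph_db"
db_dir.mkdir()
write_csv(db_dir / "files.csv", ["id", "path"], [["1", "app.py"]])

new_dir = update_on_copy(db_dir, add_file)
original_rows = (db_dir / "files.csv").read_text().splitlines()
updated_rows = (new_dir / "files.csv").read_text().splitlines()
```

The point of the copy step is isolation: a failed or partial update only affects the copy, and the original remains queryable throughout.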
System Interfaces
The Knowledge Graph will be accessible through multiple interfaces:
- Standalone CLI
  - Allows local indexing and exploration
  - Serves as a local query interface/UI
- Language Server Integration
  - Provides real-time querying within IDEs
  - Supports navigation and context building for AI features
- GitLab Server Integration
  - Enables advanced code search
  - Supports codebase understanding in Duo Chat
  - Provides context for Code Suggestions
  - Facilitates impact analysis
Recent Discussions and Progress
Recent work has been focused on:
- Rust Project Security (gitlab-org/gitlab#540414)
  - Implementing security best practices for Rust repositories
  - Protecting main branches and implementing dependency scanning
- Inter-Repository Dependency Management (gitlab-org/gitlab#536080 (closed))
  - Exploring options for sharing Rust crates across projects
  - Evaluating Git dependencies, GitLab Cargo Registry, and crates.io
- Go Integration (gitlab-org/gitlab#536081 (closed))
  - Creating a proof of concept for Go bindings
  - Designing a structure for the bindings project
- Parser Implementation (gitlab-org/rust/gitlab-code-parser#2 (closed))
  - Documenting syntax patterns for TypeScript/JavaScript
  - Specifying technical implementation details
Conclusion
The Knowledge Graph First Iteration represents a significant strategic investment in GitLab's code intelligence capabilities. By creating a comprehensive representation of code relationships, GitLab aims to enhance AI-powered features, improve developer navigation, and provide deeper insights into codebases. The project is currently in active development with multiple components progressing in parallel, including the core parser, indexer, and integration layers.
References and Links
Epic
- gitlab-org&17514 - Knowledge Graph