feat(cli): persist graph index to DuckDB
What does this MR do and why?
Part of #324 (closed). This MR adds local indexing of code graphs and lets you query them from the terminal with the same JSON DSL the server uses.
orbit index ./repo
─────────────────

  Tree-sitter          Ontology-driven          DuckDB
  (7 languages)        converter                (~/.orbit/graph.duckdb)

  repo/                AsRecordBatch            ┌────────────────┐
  ├── src/     ──>     per entity type   ──>    │ gl_file        │
  ├── lib/             + Appender API           │ gl_definition  │
  └── ...              (bulk insert)            │ gl_directory   │
                                                │ gl_imported_.. │
                                                │ gl_edge        │
                                                └───────┬────────┘
orbit query '<json>'                                    │
────────────────────                                    │
                                                        v
  Compile ──> Execute ──> Hydrate ──> Resolve content
  (DuckDB     (parameterized          (read file bytes
  dialect)    $1,$2,...)              from disk)

Multiple repos share the same DuckDB file, scoped by a deterministic project_id derived from the repo's canonical path. Re-indexing deletes the old project data first.
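To make the scoping concrete, here is a minimal sketch of how a deterministic ID could be derived from a canonical path. The `project_id` helper, the use of SHA-256, and the 16-character truncation are assumptions for illustration only; the MR does not state which hash or ID format `orbit` actually uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of orbit: derive a stable ID from a repo path.
# Assumption: SHA-256 over the canonical path, truncated to 16 hex characters.
project_id() {
  local canonical
  # realpath resolves symlinks, "." / "..", and trailing slashes, so every
  # spelling of the same directory maps to one canonical path.
  canonical="$(realpath -- "$1")" || return 1
  printf '%s' "$canonical" | sha256sum | cut -c1-16
}

# Both spellings of the current directory yield the same ID.
project_id .
project_id "$PWD"
```

Because the ID depends only on the canonical path, re-indexing the same checkout always targets the same rows, which is what makes a delete-then-insert re-index safe in a DuckDB file shared by several repos.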
Usage
# Index a repo
orbit index /path/to/repo
# Search with file content resolved from disk
orbit query '{"query_type":"search","node":{"id":"f","entity":"File","columns":["id","name","path","content"]},"limit":5}'
# Traversal with byte-range sliced definition content
orbit query '{"query_type":"traversal","nodes":[{"id":"f","entity":"File","columns":["id","path"]},{"id":"d","entity":"Definition","columns":["id","name","content"]}],"relationships":[{"type":"DEFINES","from":"f","to":"d"}],"limit":3}'
# Compile to SQL without executing
orbit compile --local '{"query_type":"search","node":{"id":"f","entity":"File","columns":"*"},"limit":10}'
# Raw JSON output
orbit query --raw '{"query_type":"search","node":{"id":"f","entity":"File","columns":["id","name"]},"limit":3}'
Edited by Michael Usachenko