[design] defining schema for relationships between definition nodes
(WIP)
Background
As currently defined, we have three distinct node types: DirectoryNode, FileNode, and DefinitionNode. All three are similar, so we could potentially merge them into a single Node type with sparse columns, since Kuzu is a columnar database and automatically compresses sparse column segments. For now, though, let us assume we keep all three.
To represent relationships between nodes, I propose a single relationships table with, at most, the following schema for every node type:
relationships
| Property | Type | Description |
|---|---|---|
| source_id | UINT32 | Outgoing or "sender" node id |
| target_id | UINT32 | Incoming or "receiver" node id |
| type | UINT8 | Relationship type |
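As a rough illustration of the column widths in the schema above, here is a minimal Python sketch of one row with range checks. `RelationshipRow` and the two `*_MAX` constants are hypothetical names used only for this example; they are not part of Kuzu or of the existing codebase.

```python
from dataclasses import dataclass

UINT32_MAX = 2**32 - 1  # largest value a UINT32 column can hold
UINT8_MAX = 2**8 - 1    # largest value a UINT8 column can hold


@dataclass(frozen=True)
class RelationshipRow:
    """One row of the proposed relationships table (hypothetical helper)."""

    source_id: int  # UINT32: outgoing or "sender" node id
    target_id: int  # UINT32: incoming or "receiver" node id
    type: int       # UINT8: relationship type code

    def __post_init__(self) -> None:
        # Reject values that would not fit in the proposed column widths.
        limits = (
            ("source_id", UINT32_MAX),
            ("target_id", UINT32_MAX),
            ("type", UINT8_MAX),
        )
        for name, limit in limits:
            value = getattr(self, name)
            if not 0 <= value <= limit:
                raise ValueError(f"{name}={value} is outside 0..{limit}")


row = RelationshipRow(source_id=1, target_id=2, type=5)
```

Validating at construction time would surface an overflowing type code before it reaches the database, which is the point at which a UINT16 migration would need to be considered.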
Kuzu doesn't have an Enum type, so we'll be stuck with something like UINT8 (256 possible values, 0 through 255) and converting the integer to a string when reading, or mapping a string to an integer when this is used for agentic RAG. Luckily, if we run out of UINT8 slots, we can always do a schema migration to UINT16.
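The integer-to-string and string-to-integer conversion could live in application code. The following is a minimal sketch using Python's `IntEnum`; the `RelationshipType` class, the `encode`/`decode` helpers, and the specific code numbers are assumptions made for illustration, not an agreed numbering.

```python
from enum import IntEnum


class RelationshipType(IntEnum):
    """Hypothetical integer codes; the numbering here is illustrative only."""

    MODULE_TO_CLASS = 0
    MODULE_TO_METHOD = 1
    CLASS_TO_METHOD = 2
    CLASS_INHERITS_FROM = 3
    METHOD_CALLS = 4


# Every code must fit in a UINT8 column (0..255).
assert all(0 <= member.value <= 255 for member in RelationshipType)


def encode(name: str) -> int:
    """Map a relationship name (e.g. from an agent's query) to its stored code."""
    return int(RelationshipType[name])


def decode(code: int) -> str:
    """Map a stored UINT8 code back to its relationship name."""
    return RelationshipType(code).name
```

Unknown names raise `KeyError` and unknown codes raise `ValueError`, so a typo in a relationship string fails loudly instead of silently matching nothing.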
Note that we could store the relationships below (originally defined in @michaelangeloio's E2E MR) entirely outside of Kuzu. Alternatively, if we also want to join on relationship strings like MODULE_TO_METHOD, we could add a relationship_metadata table. That avoids using too much memory when constructing and storing the DB, and still gives us the DX nicety of building queries without writing join predicates like ON type=5.
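To make the relationship_metadata idea concrete, here is a small sketch of the join. It uses Python's built-in `sqlite3` purely as a stand-in so the example runs anywhere; it is not Kuzu, and the column names in relationship_metadata and the codes 5 and 6 are assumptions for illustration.

```python
import sqlite3

# sqlite3 is only a stand-in here so the sketch is runnable; it is not Kuzu.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE relationships (source_id INTEGER, target_id INTEGER, type INTEGER);
    CREATE TABLE relationship_metadata (type INTEGER PRIMARY KEY, name TEXT UNIQUE);
    -- Each name is stored once here instead of once per relationship row.
    INSERT INTO relationship_metadata VALUES (5, 'MODULE_TO_METHOD'), (6, 'CLASS_TO_METHOD');
    INSERT INTO relationships VALUES (1, 10, 5), (1, 11, 5), (2, 12, 6);
""")

# Filter by the readable name instead of hard-coding type = 5.
rows = conn.execute(
    """
    SELECT r.source_id, r.target_id
    FROM relationships AS r
    JOIN relationship_metadata AS m ON m.type = r.type
    WHERE m.name = ?
    ORDER BY r.target_id
    """,
    ("MODULE_TO_METHOD",),
).fetchall()
```

The relationships table keeps only the small integer code per row, while the query author works with the string name.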
Definition Relationships We Currently Capture
All relationships below are FROM DefinitionNode TO DefinitionNode
Module Relationships
| Relationship Type | Description |
|---|---|
| MODULE_TO_CLASS | Module contains class definition |
| MODULE_TO_MODULE | Module contains/imports another module |
| MODULE_TO_METHOD | Module contains method definition |
| MODULE_TO_SINGLETON_METHOD | Module contains singleton method |
| MODULE_TO_CONSTANT | Module contains constant definition |
| MODULE_TO_LAMBDA | Module contains lambda definition |
| MODULE_TO_PROC | Module contains proc definition |
Class Relationships
| Relationship Type | Description |
|---|---|
| CLASS_TO_METHOD | Class contains method definition |
| CLASS_TO_ATTRIBUTE | Class contains attribute definition |
| CLASS_TO_CONSTANT | Class contains constant definition |
| CLASS_INHERITS_FROM | Class inheritance relationship |
| CLASS_TO_SINGLETON_METHOD | Class contains singleton method |
| CLASS_TO_CLASS | Class contains nested class |
| CLASS_TO_LAMBDA | Class contains lambda definition |
| CLASS_TO_PROC | Class contains proc definition |
Method Relationships
| Relationship Type | Description |
|---|---|
| METHOD_CALLS | Method calls another method |
| METHOD_TO_BLOCK | Method contains block definition |
| SINGLETON_METHOD_TO_BLOCK | Singleton method contains block |
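Most of the relationship types listed above are containment edges keyed by the kinds of the parent and child definitions. A minimal sketch of that lookup in Python follows; the lowercase kind strings and the `containment_type` helper are hypothetical and chosen only for this example. CLASS_INHERITS_FROM and METHOD_CALLS are left out because they are not containment relationships.

```python
from typing import Optional

# (parent kind, child kind) -> relationship type, taken from the tables above.
# The lowercase kind strings are hypothetical labels, not names from the codebase.
CONTAINMENT = {
    ("module", "class"): "MODULE_TO_CLASS",
    ("module", "module"): "MODULE_TO_MODULE",
    ("module", "method"): "MODULE_TO_METHOD",
    ("module", "singleton_method"): "MODULE_TO_SINGLETON_METHOD",
    ("module", "constant"): "MODULE_TO_CONSTANT",
    ("module", "lambda"): "MODULE_TO_LAMBDA",
    ("module", "proc"): "MODULE_TO_PROC",
    ("class", "method"): "CLASS_TO_METHOD",
    ("class", "attribute"): "CLASS_TO_ATTRIBUTE",
    ("class", "constant"): "CLASS_TO_CONSTANT",
    ("class", "singleton_method"): "CLASS_TO_SINGLETON_METHOD",
    ("class", "class"): "CLASS_TO_CLASS",
    ("class", "lambda"): "CLASS_TO_LAMBDA",
    ("class", "proc"): "CLASS_TO_PROC",
    ("method", "block"): "METHOD_TO_BLOCK",
    ("singleton_method", "block"): "SINGLETON_METHOD_TO_BLOCK",
}


def containment_type(parent_kind: str, child_kind: str) -> Optional[str]:
    """Return the relationship type for a parent/child pair, or None if not captured."""
    return CONTAINMENT.get((parent_kind, child_kind))
```

Returning `None` for pairs such as a class nested inside a method makes it explicit which combinations are not captured today.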