Establish GKG Response Format
# Unified GKG Response Schema - Combined Research
Research from four parallel investigations: frontend response parsing, GKG query engine internals, Kuzu output mapping, and two GitLab design snippets. Revised after team discussion (JG, Angelo, Michael) to adopt a graph-native nodes+edges model instead of the earlier tabular-rows-with-column-descriptors approach.
The goal: a single JSON response shape where every query type returns deduplicated nodes and instance-level edges. The frontend gets graph data directly for Three.js and renders stacked per-entity-type tables for tabular display.
---
## Table of contents
1. [Problem statement](#1-problem-statement)
2. [Current architecture](#2-current-architecture)
3. [GKG query engine internals](#3-gkg-query-engine-internals)
4. [Current frontend parsing](#4-current-frontend-parsing)
5. [Industry research](#5-industry-research-8-graph-databases)
6. [Kuzu output mapping](#6-kuzu-output-mapping)
7. [Design evolution](#7-design-evolution)
8. [Unified response schema](#8-unified-response-schema)
9. [Query and response catalog](#9-complete-query-and-response-catalog)
10. [Frontend rendering](#10-frontend-rendering)
11. [Implementation roadmap](#11-implementation-roadmap)
---
## 1. Problem statement
### Current pain
The GKG server returns flat tabular rows where graph topology is encoded in column naming conventions. The frontend has to reverse-engineer this through heuristics that break when anything changes:
| Problem | Impact |
|---------|--------|
| No metadata envelope | Frontend can't know query type, column types, or counts without inspecting data |
| Inconsistent shapes across 5 query types | Each type needs different parsing logic |
| Internal `_gkg_*` columns leak into output | Frontend filters by prefix, could hide legitimate properties |
| No native graph structure for traversals | `graph_transform.js` (entire file) exists solely to reconstruct nodes/edges from flat rows |
| No column schema | Frontend guesses types from values, generates column headers by replacing `_` with spaces |
| Alias detection is a heuristic | `detectAliases()` scans for `_type` suffix - if convention changes, parsing breaks entirely |
| Dual format handling | Code handles both `_gkg_` and plain prefixed columns with fallback logic |
| Hardcoded neighbor keys | `_gkg_neighbor_id`, `_gkg_neighbor_type`, `_gkg_relationship_type` are baked in |
| No domain info in query results | Domain is always `null` for query-derived nodes, only available from schema |
| Schema-to-query disconnect | Schema defines `label_field`/`primary_key` but query results don't reference them |
### What we want
A single JSON schema that:
1. Any frontend client can parse without knowing GKG internals
2. Returns graph data directly (nodes + edges) for visualization
3. Groups entities by type for stacked table display
4. Strips all internal `_gkg_*` columns
5. Uses one shape for all five query types
6. Is shared between server and frontend so they can't drift
---
## 2. Current architecture
### End-to-end data flow
```
User JSON Query
|
v
Rails REST API (POST /api/v4/orbit/query)
|
v
gRPC Client (Analytics::KnowledgeGraph::GrpcClient)
| bidirectional streaming (for redaction handshake)
v
GKG Server Pipeline (8 stages):
Security -> Compilation -> Execution -> Extraction -> Authorization -> Redaction -> Hydration -> Formatting
|
v
JSON array of flat row objects (current output)
|
v
Rails returns: { result: [...rows], generated_sql: string }
|
v
Frontend graph_transform.js reconstructs graph topology
|
v
Three.js visualization / GlTable display
```
### Pipeline stages
| Stage | Input | Output | Location |
|-------|-------|--------|----------|
| Security | JWT claims | `SecurityContext` (org_id + traversal_paths) | `query_pipeline/service.rs` |
| Compilation | JSON query + ontology | `CompiledQuery` (SQL + HydrationPlan) | `query-engine/lib.rs` |
| Execution | Parameterized SQL | Arrow `RecordBatch`es from ClickHouse | `stages/execution.rs` |
| Extraction | Arrow batches | `QueryResult` (typed rows + dynamic nodes) | `stages/extraction.rs` |
| Authorization | QueryResult + gRPC stream | Per-resource auth decisions from Rails | `stages/authorization.rs` |
| Redaction | Auth decisions | Rows marked authorized/unauthorized | `stages/redaction.rs` |
| Hydration | Redacted result | Properties fetched for dynamic nodes | `stages/hydration.rs` |
| Formatting | Hydrated result | JSON `Value` via `ResultFormatter` | `stages/formatting.rs` |
### Rust types
```rust
// QueryResult (redaction/query_result.rs)
pub struct QueryResult {
    rows: Vec<QueryResultRow>,
    ctx: ResultContext,
}

// QueryResultRow
pub struct QueryResultRow {
    columns: HashMap<String, ColumnValue>,
    dynamic_nodes: Vec<NodeRef>, // path nodes or neighbor node
    edge_kinds: Vec<String>,     // relationship kinds (PathFinding only)
    authorized: bool,
}

// ColumnValue
pub enum ColumnValue {
    Int64(i64),
    String(String),
    Null,
}

// NodeRef (populated by hydration)
pub struct NodeRef {
    pub id: i64,
    pub entity_type: String,
    pub properties: HashMap<String, ColumnValue>,
}

// ResultContext (enforce.rs)
pub struct ResultContext {
    pub query_type: Option<QueryType>,
    nodes: HashMap<String, RedactionNode>, // alias -> {alias, entity_type, pk_column, id_column, type_column}
    entity_auth: HashMap<String, EntityAuthConfig>,
}

// CompiledQuery (codegen.rs)
pub struct CompiledQuery {
    pub base: ParameterizedQuery, // sql + params + result_context
    pub hydration: HydrationPlan, // None | Static | Dynamic
}

// PipelineOutput (types.rs)
pub struct PipelineOutput {
    pub formatted_result: Value, // JSON array of result rows
    pub generated_sql: Option<String>,
    pub row_count: usize,
    pub redacted_count: usize,
    pub execution_time_ms: f64,
}
```
### gRPC response (current)
```protobuf
message ExecuteQueryResult {
  string result_json = 1; // JSON array
  string generated_sql = 2;
  int32 row_count = 3;
  int32 redacted_count = 4;
  double execution_time_ms = 5;
}
```
---
## 3. GKG query engine internals
### 5 query types and their current output shapes
#### Search (single-node lookup)
```json
[
{
"u_username": "admin",
"u_state": "active",
"u_id": 42,
"u_type": "User"
}
]
```
- No JOINs, single entity type
- `{alias}_{property}` naming, `{alias}_id`/`{alias}_type` metadata
#### Traversal (multi-node with relationships)
```json
[
{
"u_username": "admin",
"u_id": 42,
"u_type": "User",
"p_name": "GitLab",
"p_id": 100,
"p_type": "Project",
"e0_type": "MEMBER_OF",
"e0_src": 42,
"e0_dst": 100,
"e0_src_type": "User",
"e0_dst_type": "Project",
"e0_path": "1/2/"
}
]
```
- Multiple entity aliases in same row, edge columns with `e{N}_` prefix
- Multi-hop produces UNION ALL subqueries
#### Aggregation (GROUP BY + aggregate functions)
```json
[
{
"u_username": "admin",
"u_id": 42,
"u_type": "User",
"note_count": 15
}
]
```
- Only group_by nodes get ID/type columns
- Aggregate result columns at top level
#### PathFinding (recursive CTE)
```json
[
{
"path": [
{"id": 100, "entity_type": "Project", "name": "GitLab"},
{"id": 42, "entity_type": "MergeRequest", "title": "Fix bug"},
{"id": 200, "entity_type": "Project", "name": "Other"}
],
"edges": ["IN_PROJECT", "HAS_NOTE"],
"depth": 2
}
]
```
- Already has structured output (path array, edges array)
- Dynamic hydration fills in `NodeRef.properties`
#### Neighbors (direct neighbors)
```json
[
{
"_gkg_neighbor_id": 42,
"_gkg_neighbor_type": "MergeRequest",
"_gkg_relationship_type": "AUTHORED",
"title": "Fix bug",
"iid": 123
}
]
```
- Internal `_gkg_*` columns leak into output
- Neighbor properties merged as top-level keys (no prefix)
### Compilation pipeline
```
JSON -> Schema Validate -> Parse -> Validate -> Normalize -> Lower -> Enforce Return -> Security -> Check -> Codegen -> SQL
```
Two stages matter for the unified schema:
- **Enforce Return** (`enforce.rs`): adds mandatory `_gkg_{alias}_id` and `_gkg_{alias}_type` columns for redaction. This is where `ResultContext` gets built.
- **Codegen** (`codegen.rs`): builds `HydrationPlan` (None for aggregation, Dynamic for pathfinding/neighbors)
### Hydration
- **Static** (not yet active): Pre-compiled templates for Traversal/Search
- **Dynamic** (PathFinding, Neighbors): Entity types discovered at runtime from `dynamic_nodes`, properties fetched via additional search queries
### Formatters
Two formatters exist:
- `RawRowFormatter` - Dashboard/API consumers (flat JSON rows)
- `ContextEngineFormatter` - LLM format (currently delegates to raw, GOON format TBD)
The `GraphFormatter` will replace `RawRowFormatter`.
### File paths (query engine)
| File | Purpose |
|------|---------|
| `crates/query-engine/src/input.rs` | Input types, QueryType enum, filters |
| `crates/query-engine/src/lib.rs` | `compile()` orchestration |
| `crates/query-engine/src/enforce.rs` | ResultContext, RedactionNode, mandatory column injection |
| `crates/query-engine/src/codegen.rs` | SQL generation, CompiledQuery, HydrationPlan |
| `crates/query-engine/src/lower.rs` | Input -> AST for all query types |
| `crates/query-engine/src/constants.rs` | `_gkg_` column naming helpers |
| `crates/gkg-server/src/query_pipeline/formatter.rs` | `row_to_json`, `ResultFormatter` trait |
| `crates/gkg-server/src/query_pipeline/service.rs` | Pipeline orchestration |
| `crates/gkg-server/src/query_pipeline/types.rs` | PipelineOutput and stage I/O types |
| `crates/gkg-server/src/query_pipeline/stages/hydration.rs` | Property fetching |
| `crates/gkg-server/src/redaction/query_result.rs` | QueryResult, QueryResultRow, NodeRef, ColumnValue |
---
## 4. Current frontend parsing
### File inventory (32 files in `ee/app/assets/javascripts/orbit/`)
- **15 Vue components**: app, graph_explorer, schema_page, configuration_page, onboarding, graph_canvas, query_results_table, explorer_query_panel, explorer_hero_banner, explorer_node_sidebar, node_detail_overlay, schema_domain_sidebar, schema_node_card, schema_edge_card, schema_detail_panel
- **13 utils**: graph_transform, node_style_map, three_graph, three_nodes, three_edges, three_scene, three_interaction, three_labels, three_globe, graph_layout, graph_shaders, csv_export, orbit_theme
- **1 API module**: orbit_api.js
- **2 GraphQL**: owned_namespaces query, orbit_update mutation
### The transform pipeline (`graph_transform.js`) -- going away
This file exists only because the server returns flat rows:
1. **`detectAliases(firstRow)`**: Scans for `_gkg_{alias}_type` or `{alias}_type` patterns
2. **`buildNodeFromAlias(row, alias)`**: Extracts entity type, ID, properties by prefix
3. **`extractPrefixedProperties(row, alias)`**: Strips `{alias}_` prefix from all matching keys
4. **Edge inference**: Adjacent aliases in same row become edges with `type: 'related'`
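To make the fragility of step 1 concrete, here is a minimal sketch of suffix-based alias detection. It illustrates the convention described above and is not the actual `graph_transform.js` source; the name `detectAliasesSketch` is hypothetical, and the real implementation may apply additional filtering.

```javascript
// Hypothetical sketch: infer aliases from keys ending in `_type`,
// accepting both `_gkg_{alias}_type` and `{alias}_type`.
function detectAliasesSketch(firstRow) {
  const aliases = new Set();
  for (const key of Object.keys(firstRow)) {
    if (!key.endsWith('_type')) continue;
    const stripped = key.startsWith('_gkg_') ? key.slice('_gkg_'.length) : key;
    aliases.add(stripped.slice(0, -'_type'.length));
  }
  return [...aliases];
}

// Abbreviated traversal row from section 3. Edge columns also end in `_type`.
const row = {
  u_id: 42, u_type: 'User',
  p_id: 100, p_type: 'Project',
  e0_type: 'MEMBER_OF', e0_src_type: 'User', e0_dst_type: 'Project',
};
const detected = detectAliasesSketch(row);
console.log(detected);
```

In this naive form the edge columns (`e0_type`, `e0_src_type`, `e0_dst_type`) are picked up alongside the real node aliases `u` and `p`, so a suffix-only heuristic needs extra rules to stay correct. That kind of coupling to naming conventions is what the unified schema removes.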
### Internal node shape (what Three.js consumes)
```javascript
{
  id: 'User_42',   // {EntityType}_{entityId}
  label: 'jdoe',   // from name || title || username || full_path || id
  type: 'user',    // lowercase entity type
  domain: null,    // null for query results, only set from schema
  properties: {
    id: 42,
    username: 'jdoe',
    // ...all extracted properties with prefix stripped
  }
}
```
### Internal edge shape
```javascript
{
  source: 0,       // index into nodes array
  target: 1,       // index into nodes array
  type: 'related'  // or actual relationship type for neighbors
}
```
### 11 hardcoded assumptions
1. Alias detection via `_type` suffix convention
2. `_gkg_` prefix string
3. Neighbor fixed keys: `_gkg_neighbor_id`, `_gkg_neighbor_type`, `_gkg_relationship_type`
4. Neighbor property prefix always `n`
5. Label priority: `name || title || username || full_path || id`
6. Table filters `_`-prefixed columns
7. Row click ID matching: tries `_gkg_u_id` then `_gkg_p_id` then `id`
8. Schema node fields: `name`, `domain`, `description`, `primary_key`, `label_field`, `style.{color,size}`, `properties[].{name,data_type,nullable}`
9. Schema edge fields: `name`, `description`, `variants[].{source_type,target_type}`
10. Entity type color/name maps for ~25 known types
11. Graph ID format: `{EntityType}_{entityId}` by concatenation
---
## 5. Industry research (8 graph databases)
### Comparative analysis
| Feature | Neo4j | Kuzu | ArangoDB | TigerGraph | Dgraph | AGE |
|---------|-------|------|----------|------------|--------|-----|
| Tabular framing | fields+values | rows (keyed) | flat array | named result sets | tree | SQL rows |
| Type metadata | Typed JSON / meta | dataTypes map | none | v_type/e_type | none | ::vertex/::edge |
| Graph sidecar | Legacy only (deprecated) | No | No | No | No | No |
| Identity vs data | row+meta split | inline | inline | separate v_id+attrs | inline uid | inline |
| Streaming | Jolt (NDJSON) | No | cursor batching | No | No | No |
### Neo4j deep dive: graph sidecar is deprecated
Neo4j's graph sidecar (via `resultDataContents: ["row", "graph"]`) only exists on the deprecated HTTP API (`/db/<dbname>/tx/commit`).
1. Properties are fully duplicated between `row` and `graph` sections. No reference/pointer mechanism. The `AggregatingWriter` calls both `RowWriter` and `GraphExtractionWriter` independently against the same record.
2. The new Query API v2 (`/db/<dbname>/query/v2`, Neo4j 5.19) dropped `resultDataContents` entirely. It returns row-based data with structured inline objects only.
3. Bolt protocol (used by official drivers, Neo4j Browser, Bloom) has no graph sidecar either. Graph extraction happens client-side via `extractNodesAndRelationshipsFromRecords()`.
4. Every database we looked at converges on the same pattern: return structured inline objects in rows, let the client extract graph entities.
### Takeaways
1. Graph sidecars duplicate data. Neo4j's is deprecated and copies every property twice.
2. Structured inline objects in rows is where everyone landed (Neo4j v2, Kuzu, all others).
3. TigerGraph has the cleanest identity/data separation: `{ v_id, v_type, attributes: {} }`.
4. Kuzu's `dataTypes` map is worth stealing -- column-level type metadata alongside results.
5. ArangoDB paths use `{ vertices: [], edges: [] }`, similar to our PathFinding.
6. All databases use the same envelope for aggregated and non-aggregated queries. Cell content changes, not structure.
### Patterns to adopt
| Pattern | Source | Benefit |
|---------|--------|---------|
| Structured inline objects in rows (no sidecar) | Neo4j v2, Kuzu, Bolt | No duplication, single source of truth |
| Column-level type metadata (`dataTypes` / `schema`) | Kuzu + Neo4j Typed JSON | Frontend knows column types, can dispatch extraction |
| Identity/data separation (`v_id` + `attributes`) | TigerGraph | Clean, no mixing of system and user fields |
| String composite IDs (`"user:42"`) | SurrealDB pattern | Frontend uniqueness without concatenation hacks |
| Display hint (table vs graph) | Novel | Frontend knows natural visualization mode |
| Schema-driven extraction | Kuzu Explorer | Frontend dispatches by `schema.columns[].type`, not naming conventions |
---
## 6. Kuzu output mapping
### Wire format
```json
{
"rows": [/* row objects keyed by column name */],
"dataTypes": { "columnName": "TYPE_STRING" },
"isSchemaChanged": false,
"isMultiStatement": false
}
```
### Node shape
```json
{
"_label": "person",
"_id": { "offset": 0, "table": 0 },
"name": "Alice",
"age": 35
}
```
System properties (`_label`, `_id`) sit at the same level as user properties; only the underscore prefix distinguishes them.
### Relationship shape
```json
{
"_src": { "offset": 3, "table": 0 },
"_dst": { "offset": 0, "table": 1 },
"_label": "workAt",
"_id": { "offset": 2, "table": 4 },
"year": 2010,
"rating": 7.6
}
```
### Path (recursive rel) shape
```json
{
"_nodes": [/* intermediate NodeValue objects */],
"_rels": [/* all RelValue objects in path */]
}
```
### Explorer graph extraction pipeline
1. Server returns `{ rows, dataTypes }`
2. Frontend `extractGraphFromQueryResult()` iterates rows, dispatches by `dataTypes[column]`:
- `NODE` -> `processNode()` -> G6 node `{id, data: {properties}, style: {...}}`
- `REL` -> `processRel()` -> G6 edge `{id, source, target, data: {properties}}`
- `RECURSIVE_REL` -> extracts `_nodes` and `_rels`, processes each
- Scalars ignored in graph view (shown only in table/JSON)
3. Deduplication by encoded ID (`"table_offset"` string)
4. Performance cap at 500 nodes (configurable)
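The dispatch-by-type idea in the steps above can be sketched as follows. This is a simplified illustration, not Kuzu Explorer's code: the function names `encodeId` and `extractGraph`, the output object shapes, and the way the node cap is applied are assumptions.

```javascript
// Encode Kuzu's internal { offset, table } ID as a "table_offset" string.
function encodeId(id) {
  return `${id.table}_${id.offset}`;
}

// Walk every cell and dispatch on the column's declared data type.
function extractGraph(result, maxNodes = 500) {
  const nodes = new Map();
  const edges = new Map();

  const addNode = (value) => {
    const id = encodeId(value._id);
    if (!nodes.has(id) && nodes.size < maxNodes) {
      nodes.set(id, { id, label: value._label, data: value });
    }
  };
  const addRel = (value) => {
    const id = encodeId(value._id);
    if (!edges.has(id)) {
      edges.set(id, {
        id,
        source: encodeId(value._src),
        target: encodeId(value._dst),
        label: value._label,
        data: value,
      });
    }
  };

  for (const row of result.rows) {
    for (const [column, value] of Object.entries(row)) {
      if (value === null || value === undefined) continue;
      switch (result.dataTypes[column]) {
        case 'NODE': addNode(value); break;
        case 'REL': addRel(value); break;
        case 'RECURSIVE_REL':
          value._nodes.forEach(addNode);
          value._rels.forEach(addRel);
          break;
        default: break; // scalars are shown only in table/JSON views
      }
    }
  }
  return { nodes: [...nodes.values()], edges: [...edges.values()] };
}

// Two rows that repeat the same node and relationship.
const person = { _label: 'person', _id: { offset: 0, table: 0 }, name: 'Alice', age: 35 };
const workAt = {
  _src: { offset: 3, table: 0 }, _dst: { offset: 0, table: 1 },
  _label: 'workAt', _id: { offset: 2, table: 4 }, year: 2010,
};
const extracted = extractGraph({
  dataTypes: { p: 'NODE', w: 'REL', n: 'INT64' },
  rows: [{ p: person, w: workAt, n: 1 }, { p: person, w: workAt, n: 2 }],
});
console.log(extracted.nodes.length, extracted.edges.length);
```

The point to carry over is that the extractor never inspects column names; it reads only the type map.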
### Schema (separate endpoint)
```json
{
"nodeTables": [
{ "name": "person", "properties": [{ "name": "ID", "type": "INT64", "isPrimaryKey": true }] }
],
"relTables": [
{ "name": "workAt", "properties": [...], "connectivity": [{ "src": "person", "dst": "organisation" }] }
]
}
```
### What to steal
- `dataTypes` metadata map alongside results (the frontend needs to know column types)
- Underscore-prefixed system properties (`_label`, `_id`) separate from user data
- Schema as a separate endpoint (maps to our existing `get_ontology` gRPC call)
- Type dispatch in the extraction pipeline (dispatch by column type, not naming convention)
### What to skip
- Internal IDs (`{offset, table}`) -- use domain-meaningful string IDs instead
- The lack of query metadata -- we need execution plans, SQL, provenance
- The lack of pagination on the wire -- Kuzu returns everything at once
- The lack of redaction markers -- Kuzu has no redaction concept
---
## 7. Design evolution
### Round 1: graph sidecar (rejected)
**Snippet 5965027 (Michael U.)** proposed `{ metadata, rows, graph }` with a graph sidecar of deduplicated `nodes[]` + `edges[]`. **Snippet 5965036 (Angelo)** extended it with column descriptors, property types, `displayLabel`, edge weights, and a JSON Schema (draft-07).
Both included a graph sidecar alongside tabular rows. After investigating Neo4j's implementation (section 5), we dropped the sidecar:
1. Neo4j's sidecar fully duplicates all property data (no reference mechanism)
2. Neo4j dropped it in Query API v2 (5.19+)
3. Every other database (Kuzu, ArangoDB, TigerGraph, Memgraph, SurrealDB) returns rows only
4. Neo4j Browser, Bloom, and Kuzu Explorer all do graph extraction client-side
### Round 2: tabular rows with column descriptors (superseded)
The next iteration used `{ metadata, columns, edges, rows }`, where each row was a map of alias keys to node objects. Column descriptors told the frontend each column's type, and edge specs described column-to-column relationships. The frontend extracted graph topology by iterating rows and pairing node values with edge specs.
This worked but had two problems:
1. **PathFinding and Neighbors needed special row shapes.** Paths are variable-length sequences that don't fit fixed columns. Neighbors have dynamic entity types. Both required per-query-type dispatch in the consumer.
2. **Data duplication in rows.** A user appearing in 50 traversal rows had their properties repeated 50 times.
### Round 3: nodes + edges (current design)
After team discussion (JG, Angelo), we landed on a graph-native model: the server returns deduplicated nodes and instance-level edges. No rows, no column descriptors for graph data.
JG's key observation: for table display, group entities by type and show each type in its own table, stacked. The edges let you build graph visualizations directly. The response shape stays the same across all query types.
What we keep from earlier rounds:
- Proto-level metadata: query_type, row_count, raw_query_strings
- Cached ontology for display info (labels, domains, styles, property types)
- `ResultContext` extension with relationship refs
- `ResultFormatter` trait for pluggable formatting
---
## 8. Unified response schema
### Two layers
**Layer 1: Ontology (cached, fetched once via `GET /api/v4/orbit/schema`)**
The ontology holds entity types, property definitions, label fields, domains, styles, and edge definitions. It changes when the schema updates, not per query. GKG already has this infrastructure and the frontend already calls it on load to build `nodeStyleMap`.
```json
{
"schema_version": "1.0.0",
"domains": [
{ "name": "core", "node_names": ["User", "Project", "Group"] },
{ "name": "code_review", "node_names": ["MergeRequest", "Note"] }
],
"nodes": [
{
"name": "User",
"domain": "core",
"description": "A GitLab user account",
"primary_key": "id",
"label_field": "username",
"style": { "color": "#10B981", "size": 32 },
"default_columns": ["id", "username", "name", "state"],
"properties": [
{ "name": "id", "data_type": "Int64", "nullable": false },
{ "name": "username", "data_type": "String", "nullable": true },
{ "name": "name", "data_type": "String", "nullable": true },
{ "name": "email", "data_type": "String", "nullable": false },
{ "name": "state", "data_type": "String", "nullable": true },
{ "name": "created_at", "data_type": "Timestamp", "nullable": true }
]
}
],
"edges": [
{
"name": "AUTHORED",
"description": "User authored a merge request",
"variants": [
{ "source_type": "User", "target_type": "MergeRequest" },
{ "source_type": "User", "target_type": "Note" }
]
}
]
}
```
The frontend caches this and builds lookup maps:
- `ontology.nodes["User"].label_field` -> `"username"`
- `ontology.nodes["User"].domain` -> `"core"`
- `ontology.nodes["User"].style` -> `{ color: "#10B981", size: 32 }`
- `ontology.nodes["User"].properties` -> full type info for table column formatting
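A minimal sketch of building those lookup maps from the ontology payload. The function name `buildOntologyIndex` and the extra `propertyByName` map are assumptions for illustration; only the field names come from the JSON above.

```javascript
// Turn the ontology's arrays into name-keyed lookup objects.
function buildOntologyIndex(ontology) {
  const nodes = {};
  for (const node of ontology.nodes) {
    const propertyByName = {};
    for (const prop of node.properties) propertyByName[prop.name] = prop;
    // Keep the original definition and add a per-property lookup.
    nodes[node.name] = { ...node, propertyByName };
  }
  const edges = {};
  for (const edge of ontology.edges) edges[edge.name] = edge;
  return { nodes, edges };
}

// Abbreviated ontology using the field names shown above.
const ontologyIndex = buildOntologyIndex({
  nodes: [{
    name: 'User',
    domain: 'core',
    label_field: 'username',
    style: { color: '#10B981', size: 32 },
    properties: [
      { name: 'id', data_type: 'Int64', nullable: false },
      { name: 'username', data_type: 'String', nullable: true },
    ],
  }],
  edges: [{ name: 'AUTHORED', variants: [{ source_type: 'User', target_type: 'MergeRequest' }] }],
});
console.log(ontologyIndex.nodes.User.label_field);
```

Because the index is built once per ontology fetch, per-query rendering needs only constant-time lookups by entity type.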
**Layer 2: Query response (per-request)**
The response splits into two parts: proto-level metadata (typed fields on the gRPC message) and a JSON payload (the `result_json` string containing nodes + edges).
**Proto envelope** (decided in sync, see [MR 411 note](https://gitlab.com/gitlab-org/orbit/knowledge-graph/-/merge_requests/411#note_3132681793)):
```protobuf
message ExecuteQueryResult {
  string result_json = 1;                // JSON string: { columns, nodes, edges }
  string query_type = 2;                 // "search", "traversal", etc.
  repeated string raw_query_strings = 3; // SQL strings through pipeline stages
  int32 row_count = 4;                   // total rows returned
}
```
Decisions from sync:
- **Metadata lives in proto, not in the JSON string.** The Rails gRPC client reads typed fields, not embedded JSON metadata.
- **No `redacted_count`** in any response (proto or JSON). Redaction is invisible to the consumer.
- **No pipeline timing** (`execution_time_ms`) in the response.
- **`raw_query_strings`** replaces the single `generated_sql` field. Array of SQL strings from compilation stages (useful for debugging multi-stage queries like UNION ALL).
**JSON payload** (`result_json`):
Every query returns the same shape: `columns` (descriptors for aggregation-computed values, empty otherwise), `nodes` (deduplicated entity objects), and `edges` (instance-level relationships).
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "state": "active" },
{ "type": "User", "id": 2, "username": "bob", "name": "Bob", "state": "active" },
{ "type": "Project", "id": 101, "name": "Alpha", "full_path": "gitlab-org/alpha" },
{ "type": "Project", "id": 102, "name": "Beta", "full_path": "gitlab-org/beta" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "Project", "to_id": 101, "type": "MEMBER_OF" },
{ "from": "User", "from_id": 1, "to": "Project", "to_id": 102, "type": "MEMBER_OF" },
{ "from": "User", "from_id": 2, "to": "Project", "to_id": 101, "type": "MEMBER_OF" }
]
}
```
No `label_field`, no `domain`, no `style`. The frontend looks up `"User"` in the cached ontology for all of that. No metadata envelope inside the JSON -- that all lives in proto fields.
### Design principles
1. **Nodes are deduplicated** -- each entity appears once, even if it participates in many edges
2. **Edges are instance-level** -- each edge connects two specific nodes by their type and ID
3. **One shape for all query types** -- `columns` + `nodes` + `edges` in JSON; `query_type` + `row_count` + `raw_query_strings` in proto
4. **No internal columns leak** -- the formatter strips all `_gkg_*` prefixes
5. **Metadata in proto, data in JSON** -- typed fields for query metadata, JSON string for graph data
6. **No redaction info exposed** -- redaction is invisible to the consumer
7. **Ontology is cached** -- display metadata (labels, domains, styles, property types) comes from the schema, not the response
8. **`id` and `type` always included on nodes** -- graph identity needs both, even if the user didn't explicitly select `id`
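Principles 1 and 2 can be illustrated with a small deduplication sketch. It is written in JavaScript purely for illustration and is not the Rust `GraphFormatter`; the input shape (`{ source, target, type }` per row) and the name `toGraph` are assumptions.

```javascript
// Collapse relationship rows into deduplicated nodes and instance-level edges.
function toGraph(rows) {
  const nodes = new Map();
  const edges = new Map();
  const keyOf = (n) => `${n.type}:${n.id}`;

  for (const { source, target, type } of rows) {
    for (const n of [source, target]) {
      if (!nodes.has(keyOf(n))) nodes.set(keyOf(n), n);
    }
    const edgeKey = `${keyOf(source)}-[${type}]->${keyOf(target)}`;
    if (!edges.has(edgeKey)) {
      edges.set(edgeKey, {
        from: source.type, from_id: source.id,
        to: target.type, to_id: target.id,
        type,
      });
    }
  }
  return { columns: [], nodes: [...nodes.values()], edges: [...edges.values()] };
}

const alice = { type: 'User', id: 1, username: 'alice' };
const bob = { type: 'User', id: 2, username: 'bob' };
const alpha = { type: 'Project', id: 101, name: 'Alpha' };
const beta = { type: 'Project', id: 102, name: 'Beta' };
const graph = toGraph([
  { source: alice, target: alpha, type: 'MEMBER_OF' },
  { source: alice, target: beta, type: 'MEMBER_OF' },
  { source: bob, target: alpha, type: 'MEMBER_OF' },
  { source: alice, target: alpha, type: 'MEMBER_OF' }, // duplicate row
]);
console.log(graph.nodes.length, graph.edges.length);
```

Each entity is stored once under its `type:id` key, while every distinct (source, type, target) triple becomes one edge.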
### Node shape
A flat object with `type`, `id`, and entity properties:
```json
{ "type": "User", "id": 42, "username": "alice", "name": "Alice", "state": "active" }
```
The frontend constructs composite IDs (`"User:42"`) for deduplication and graph rendering. Property names match the ontology definitions, so the frontend knows their data types from the cached schema.
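One way the frontend might map a response node onto a display node, assuming a name-keyed ontology lookup is available. The function name `toDisplayNode`, the output field names, and the fallback to `id` when no label is found are assumptions of this sketch.

```javascript
// Build a display node: composite ID, label from the ontology's label_field,
// and domain from the cached ontology rather than from the response.
function toDisplayNode(node, ontologyNodes) {
  const def = ontologyNodes[node.type] || {};
  const { type, ...properties } = node;
  const labelValue = def.label_field ? node[def.label_field] : undefined;
  return {
    id: `${type}:${node.id}`,
    label: String(labelValue ?? node.id),
    type,
    domain: def.domain ?? null,
    properties,
  };
}

const displayNode = toDisplayNode(
  { type: 'User', id: 42, username: 'alice', name: 'Alice', state: 'active' },
  { User: { label_field: 'username', domain: 'core' } },
);
console.log(displayNode.id, displayNode.label);
```

Unknown entity types still render here, with the numeric ID as the label and a `null` domain, so a stale ontology cache degrades display rather than breaking it.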
### Edge shape
Two nodes connected by type and ID:
```json
{ "from": "User", "from_id": 1, "to": "Project", "to_id": 101, "type": "MEMBER_OF" }
```
Optional properties:
- `depth` -- for variable-length traversals, how many hops this edge represents
- `path_id` + `step` -- for path finding, which path this edge belongs to and its position
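If the renderer keeps using index-based edges (`source` and `target` as positions in the nodes array, as in section 4), the conversion from this edge shape is a lookup by composite key. The name `toIndexedEdges` is hypothetical, and skipping edges whose endpoints are missing from `nodes` is an assumption of this sketch rather than a decided behavior.

```javascript
// Resolve type+id edge endpoints to indices in the nodes array.
function toIndexedEdges(nodes, edges) {
  const indexByKey = new Map(nodes.map((n, i) => [`${n.type}:${n.id}`, i]));
  const indexed = [];
  for (const e of edges) {
    const source = indexByKey.get(`${e.from}:${e.from_id}`);
    const target = indexByKey.get(`${e.to}:${e.to_id}`);
    if (source === undefined || target === undefined) continue; // endpoint not in response
    const edge = { source, target, type: e.type };
    if (e.depth !== undefined) edge.depth = e.depth; // optional property passes through
    indexed.push(edge);
  }
  return indexed;
}

const indexedEdges = toIndexedEdges(
  [{ type: 'User', id: 1 }, { type: 'Project', id: 101 }],
  [
    { from: 'User', from_id: 1, to: 'Project', to_id: 101, type: 'MEMBER_OF', depth: 1 },
    { from: 'User', from_id: 1, to: 'Project', to_id: 999, type: 'MEMBER_OF' },
  ],
);
console.log(indexedEdges);
```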
### Columns array
Empty for most query types. Used only for aggregation to describe computed values that are attached to nodes:
```json
[
{ "name": "mr_count", "type": "Int64", "aggregation": "count" },
{ "name": "avg_age", "type": "Float64", "aggregation": "avg" }
]
```
When `columns` is non-empty, the listed properties appear on nodes but aren't real entity properties. The frontend knows: anything in `columns` with an `aggregation` field is query-derived. Everything else on the node comes from the entity definition.
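A short sketch of that rule: separate a node's entity properties from its query-derived values using the `columns` descriptors. The function name `splitNodeProperties` and the returned `{ entity, derived }` shape are assumptions.

```javascript
// Anything named in `columns` with an `aggregation` field is query-derived.
function splitNodeProperties(node, columns) {
  const derivedNames = new Set(
    columns.filter((c) => c.aggregation).map((c) => c.name),
  );
  const entity = {};
  const derived = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === 'type') continue; // identity field, not a property
    (derivedNames.has(key) ? derived : entity)[key] = value;
  }
  return { entity, derived };
}

const split = splitNodeProperties(
  { type: 'User', id: 1, username: 'alice', mr_count: 15 },
  [{ name: 'mr_count', type: 'Int64', aggregation: 'count' }],
);
console.log(split.entity, split.derived);
```

A table renderer could use `derived` to style computed columns differently from ontology-backed ones.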
### Display hint
The frontend derives the default visualization mode from `query_type` (available as a proto field):
| Query type | Default view | Rationale |
|------------|-------------|-----------|
| `search`, `aggregation` | Table | Single entity type or scalar metrics -- tabular is natural |
| `traversal`, `path_finding`, `neighbors` | Graph | Multiple entity types with relationships -- graph is natural |
The frontend can always switch between table and graph regardless of the default.
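The table above reduces to a one-line lookup. In this sketch the name `defaultView` is hypothetical, and falling back to the table view for unrecognized query types is an assumption.

```javascript
// Query types whose natural default view is the graph.
const GRAPH_QUERY_TYPES = new Set(['traversal', 'path_finding', 'neighbors']);

// Derive the default view from the proto-level query_type field.
function defaultView(queryType) {
  return GRAPH_QUERY_TYPES.has(queryType) ? 'graph' : 'table';
}

console.log(defaultView('search'), defaultView('traversal'));
```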
---
## 9. Complete query and response catalog
Every query type with the input JSON and the `result_json` payload. Proto-level metadata (`query_type`, `row_count`, `raw_query_strings`) travels in typed proto fields outside this JSON. All payloads use the same `{ columns, nodes, edges }` shape.
### 9.1 Search
Single entity lookup with optional filters.
**Query:**
```json
{
"query_type": "search",
"node": { "id": "u", "entity": "User", "columns": "*", "filters": { "state": { "op": "eq", "value": "active" } } },
"limit": 20
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "email": "alice@example.com", "state": "active", "created_at": "2024-01-10T08:00:00Z" },
{ "type": "User", "id": 2, "username": "bob", "name": "Bob", "email": "bob@example.com", "state": "active", "created_at": "2024-03-15T12:00:00Z" },
{ "type": "User", "id": 5, "username": "carol", "name": "Carol", "email": "carol@example.com", "state": "active", "created_at": "2025-06-01T09:30:00Z" }
],
"edges": []
}
```
Table: one Users table. Graph: just nodes, no edges. Frontend looks up `ontology.nodes["User"]` for label_field, domain, style.
### 9.2 Traversal (single-hop, 2 nodes)
**Query:**
```json
{
"query_type": "traversal",
"nodes": [
{ "id": "u", "entity": "User" },
{ "id": "p", "entity": "Project" }
],
"relationships": [
{ "types": ["MEMBER_OF"], "from": "u", "to": "p" }
],
"limit": 25
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "state": "active" },
{ "type": "User", "id": 2, "username": "bob", "name": "Bob", "state": "active" },
{ "type": "Project", "id": 101, "name": "Alpha", "full_path": "gitlab-org/alpha", "visibility_level": "internal" },
{ "type": "Project", "id": 102, "name": "Beta", "full_path": "gitlab-org/beta", "visibility_level": "public" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "Project", "to_id": 101, "type": "MEMBER_OF" },
{ "from": "User", "from_id": 1, "to": "Project", "to_id": 102, "type": "MEMBER_OF" },
{ "from": "User", "from_id": 2, "to": "Project", "to_id": 101, "type": "MEMBER_OF" }
]
}
```
Table: Users table stacked above Projects table. Graph: 4 nodes, 3 edges, direct to Three.js. User:1 appears once despite having 2 edges -- no data duplication.
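The stacked-table view can be derived from `nodes` alone. A minimal sketch, assuming a name-keyed ontology lookup; the function name `groupNodesByType` and the fallback to the first row's keys when `default_columns` is unavailable are assumptions.

```javascript
// Group nodes by entity type, one table per type, in first-seen order.
function groupNodesByType(nodes, ontologyNodes = {}) {
  const groups = new Map();
  for (const node of nodes) {
    if (!groups.has(node.type)) groups.set(node.type, []);
    groups.get(node.type).push(node);
  }
  return [...groups.entries()].map(([type, rows]) => ({
    type,
    columns: ontologyNodes[type]?.default_columns
      ?? Object.keys(rows[0]).filter((k) => k !== 'type'),
    rows,
  }));
}

const tables = groupNodesByType(
  [
    { type: 'User', id: 1, username: 'alice', name: 'Alice', state: 'active' },
    { type: 'User', id: 2, username: 'bob', name: 'Bob', state: 'active' },
    { type: 'Project', id: 101, name: 'Alpha', full_path: 'gitlab-org/alpha', visibility_level: 'internal' },
    { type: 'Project', id: 102, name: 'Beta', full_path: 'gitlab-org/beta', visibility_level: 'public' },
  ],
  { User: { default_columns: ['id', 'username', 'name', 'state'] } },
);
console.log(tables.map((t) => `${t.type}: ${t.rows.length} rows`));
```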
### 9.3 Traversal (3 nodes, chained)
**Query:**
```json
{
"query_type": "traversal",
"nodes": [
{ "id": "u", "entity": "User" },
{ "id": "n", "entity": "Note" },
{ "id": "p", "entity": "Project" }
],
"relationships": [
{ "types": ["AUTHORED"], "from": "u", "to": "n" },
{ "types": ["CONTAINS"], "from": "p", "to": "n" }
],
"limit": 20
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "state": "active" },
{ "type": "Note", "id": 50, "title": "Review note" },
{ "type": "Note", "id": 51, "title": "Bug report" },
{ "type": "Project", "id": 101, "name": "Alpha", "full_path": "gitlab-org/alpha", "visibility_level": "internal" },
{ "type": "Project", "id": 102, "name": "Beta", "full_path": "gitlab-org/beta", "visibility_level": "public" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "Note", "to_id": 50, "type": "AUTHORED" },
{ "from": "User", "from_id": 1, "to": "Note", "to_id": 51, "type": "AUTHORED" },
{ "from": "Project", "from_id": 101, "to": "Note", "to_id": 50, "type": "CONTAINS" },
{ "from": "Project", "from_id": 102, "to": "Note", "to_id": 51, "type": "CONTAINS" }
]
}
```
Correct topology: `User:1 -> Note:50`, `Project:101 -> Note:50`, `User:1 -> Note:51`, `Project:102 -> Note:51`. Edge direction comes from the ontology, not query ordering.
### 9.4 Traversal (star pattern)
One node connected to multiple others via different relationships.
**Query:**
```json
{
"query_type": "traversal",
"nodes": [
{ "id": "u", "entity": "User" },
{ "id": "mr", "entity": "MergeRequest" },
{ "id": "p", "entity": "Project" }
],
"relationships": [
{ "types": ["AUTHORED"], "from": "u", "to": "mr" },
{ "types": ["MEMBER_OF"], "from": "u", "to": "p" }
],
"limit": 20
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "state": "active" },
{ "type": "MergeRequest", "id": 42, "iid": 5, "title": "Fix bug", "state": "merged" },
{ "type": "MergeRequest", "id": 43, "iid": 6, "title": "Add feature", "state": "opened" },
{ "type": "Project", "id": 101, "name": "Alpha", "full_path": "gitlab-org/alpha", "visibility_level": "internal" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "MergeRequest", "to_id": 42, "type": "AUTHORED" },
{ "from": "User", "from_id": 1, "to": "MergeRequest", "to_id": 43, "type": "AUTHORED" },
{ "from": "User", "from_id": 1, "to": "Project", "to_id": 101, "type": "MEMBER_OF" }
]
}
```
Star topology: User:1 is the hub. Two AUTHORED edges, one MEMBER_OF edge. Not a chain.
### 9.5 Traversal (variable-length, max_hops > 1)
UNION ALL unrolling. Only endpoints returned, no intermediate nodes.
**Query:**
```json
{
"query_type": "traversal",
"nodes": [
{ "id": "u", "entity": "User" },
{ "id": "p", "entity": "Project" }
],
"relationships": [
{ "types": ["MEMBER_OF"], "from": "u", "to": "p", "min_hops": 1, "max_hops": 3 }
],
"limit": 25
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "state": "active" },
{ "type": "Project", "id": 101, "name": "Alpha", "full_path": "gitlab-org/alpha", "visibility_level": "internal" },
{ "type": "Project", "id": 102, "name": "Beta", "full_path": "gitlab-org/beta", "visibility_level": "public" },
{ "type": "Project", "id": 103, "name": "Gamma", "full_path": "gitlab-org/gamma", "visibility_level": "private" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "Project", "to_id": 101, "type": "MEMBER_OF", "depth": 1 },
{ "from": "User", "from_id": 1, "to": "Project", "to_id": 102, "type": "MEMBER_OF", "depth": 2 },
{ "from": "User", "from_id": 1, "to": "Project", "to_id": 103, "type": "MEMBER_OF", "depth": 3 }
]
}
```
`depth` on the edge. Direct connections have `depth: 1`. The codegen produces one UNION ALL subquery per hop level, but the formatter just processes them as regular rows and deduplicates nodes.
### 9.6 Traversal (mixed single + variable-length)
**Query:**
```json
{
"query_type": "traversal",
"nodes": [
{ "id": "u", "entity": "User" },
{ "id": "n", "entity": "Note" },
{ "id": "p", "entity": "Project" }
],
"relationships": [
{ "types": ["AUTHORED"], "from": "u", "to": "n" },
{ "types": ["CONTAINS"], "from": "p", "to": "n", "min_hops": 1, "max_hops": 2 }
],
"limit": 20
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "state": "active" },
{ "type": "Note", "id": 50, "title": "Review" },
{ "type": "Note", "id": 51, "title": "Report" },
{ "type": "Project", "id": 101, "name": "Alpha", "full_path": "gitlab-org/alpha", "visibility_level": "internal" },
{ "type": "Project", "id": 102, "name": "Beta", "full_path": "gitlab-org/beta", "visibility_level": "public" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "Note", "to_id": 50, "type": "AUTHORED" },
{ "from": "User", "from_id": 1, "to": "Note", "to_id": 51, "type": "AUTHORED" },
{ "from": "Project", "from_id": 101, "to": "Note", "to_id": 50, "type": "CONTAINS", "depth": 1 },
{ "from": "Project", "from_id": 102, "to": "Note", "to_id": 51, "type": "CONTAINS", "depth": 2 }
]
}
```
Single-hop edges (AUTHORED) have no `depth`. Variable-length edges (CONTAINS) carry `depth`. No `_depth_1` naming convention -- it's a property on the specific edge.
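A consumer can tell the two kinds of edge apart by checking for the `depth` property. The following is a minimal sketch, assuming the response shape shown above; `partitionEdgesByHopKind` is a hypothetical helper name and not part of the existing frontend.

```javascript
// Hypothetical helper: split edges into single-hop and variable-length
// groups, using the presence of a numeric `depth` as the discriminator.
function partitionEdgesByHopKind(edges) {
  const singleHop = [];
  const variableLength = [];
  for (const edge of edges) {
    // Variable-length edges carry `depth`; single-hop edges omit it.
    if (typeof edge.depth === 'number') {
      variableLength.push(edge);
    } else {
      singleHop.push(edge);
    }
  }
  return { singleHop, variableLength };
}

// Usage with two of the edges from the response above.
const mixedEdges = [
  { from: 'User', from_id: 1, to: 'Note', to_id: 50, type: 'AUTHORED' },
  { from: 'Project', from_id: 102, to: 'Note', to_id: 51, type: 'CONTAINS', depth: 2 },
];
const hopKinds = partitionEdgesByHopKind(mixedEdges);
```

Here `hopKinds.singleHop` holds the AUTHORED edge and `hopKinds.variableLength` holds the CONTAINS edge.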
### 9.7 Traversal (wide query, 4 entities)
**Query:**
```json
{
"query_type": "traversal",
"nodes": [
{ "id": "u", "entity": "User" },
{ "id": "mr", "entity": "MergeRequest" },
{ "id": "p", "entity": "Project" },
{ "id": "f", "entity": "File" }
],
"relationships": [
{ "types": ["AUTHORED"], "from": "u", "to": "mr" },
{ "types": ["IN_PROJECT"], "from": "mr", "to": "p" },
{ "types": ["CONTAINS"], "from": "p", "to": "f" }
],
"limit": 10
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "state": "active" },
{ "type": "MergeRequest", "id": 42, "iid": 5, "title": "Fix login bug", "state": "merged" },
{ "type": "Project", "id": 101, "name": "Alpha", "full_path": "gitlab-org/alpha", "visibility_level": "internal" },
{ "type": "File", "id": 500, "path": "app/controllers/sessions_controller.rb", "name": "sessions_controller.rb", "language": "ruby" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "MergeRequest", "to_id": 42, "type": "AUTHORED" },
{ "from": "MergeRequest", "from_id": 42, "to": "Project", "to_id": 101, "type": "IN_PROJECT" },
{ "from": "Project", "from_id": 101, "to": "File", "to_id": 500, "type": "CONTAINS" }
]
}
```
Table: four entity tables stacked (User, MR, Project, File), edge tables between them. Graph: 4 nodes, 3 edges.
### 9.8 Aggregation (COUNT with GROUP BY)
**Query:**
```json
{
"query_type": "aggregation",
"nodes": [
{ "id": "u", "entity": "User" },
{ "id": "mr", "entity": "MergeRequest" }
],
"relationships": [
{ "types": ["AUTHORED"], "from": "u", "to": "mr" }
],
"aggregations": [
{ "function": "count", "target": "mr", "group_by": "u", "alias": "mr_count" }
],
"aggregation_sort": { "agg_index": 0, "direction": "desc" },
"limit": 10
}
```
**Response:**
```json
{
"columns": [
{ "name": "mr_count", "type": "Int64", "aggregation": "count" }
],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "state": "active", "mr_count": 42 },
{ "type": "User", "id": 2, "username": "bob", "name": "Bob", "state": "active", "mr_count": 15 },
{ "type": "User", "id": 3, "username": "carol", "name": "Carol", "state": "active", "mr_count": 8 }
],
"edges": []
}
```
Computed values are inlined on the group-by node. `columns` lists what's computed so the frontend can distinguish them from real entity properties. The Users table shows username, name, state, mr_count -- the frontend knows from the ontology which columns are entity properties and from `columns` which are aggregates.
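As an illustration of how `columns` lets a consumer separate aggregates from entity properties, here is a small sketch. It relies only on the `columns` list and does not consult the ontology; `splitComputedValues` is a hypothetical name.

```javascript
// Hypothetical helper: separate a node's entity properties from values
// computed by the query, based on the names listed in `columns`.
function splitComputedValues(node, columns) {
  const computedNames = new Set((columns || []).map((c) => c.name));
  const properties = {};
  const computed = {};
  for (const [key, value] of Object.entries(node)) {
    if (computedNames.has(key)) {
      computed[key] = value;
    } else {
      properties[key] = value;
    }
  }
  return { properties, computed };
}

// Usage with the first node of the response above.
const aggColumns = [{ name: 'mr_count', type: 'Int64', aggregation: 'count' }];
const aliceNode = { type: 'User', id: 1, username: 'alice', name: 'Alice', state: 'active', mr_count: 42 };
const aliceSplit = splitComputedValues(aliceNode, aggColumns);
```

`aliceSplit.computed` contains only `mr_count`, and `aliceSplit.properties` contains the remaining fields.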
### 9.9 Aggregation (multiple functions)
**Query:**
```json
{
"query_type": "aggregation",
"nodes": [
{ "id": "p", "entity": "Project" },
{ "id": "mr", "entity": "MergeRequest" }
],
"relationships": [
{ "types": ["IN_PROJECT"], "from": "mr", "to": "p" }
],
"aggregations": [
{ "function": "count", "target": "mr", "group_by": "p", "alias": "mr_count" },
{ "function": "avg", "target": "mr", "group_by": "p", "alias": "avg_mr" },
{ "function": "max", "target": "mr", "group_by": "p", "alias": "max_mr_id" }
],
"limit": 10
}
```
**Response:**
```json
{
"columns": [
{ "name": "mr_count", "type": "Int64", "aggregation": "count" },
{ "name": "avg_mr", "type": "Float64", "aggregation": "avg" },
{ "name": "max_mr_id", "type": "Int64", "aggregation": "max" }
],
"nodes": [
{ "type": "Project", "id": 101, "name": "Alpha", "full_path": "gitlab-org/alpha", "visibility_level": "internal", "mr_count": 15, "avg_mr": 42.7, "max_mr_id": 99 },
{ "type": "Project", "id": 102, "name": "Beta", "full_path": "gitlab-org/beta", "visibility_level": "public", "mr_count": 8, "avg_mr": 23.1, "max_mr_id": 55 }
],
"edges": []
}
```
Multiple computed columns with different types (Int64 vs Float64). All inlined on the Project nodes, all listed in `columns`.
### 9.10 PathFinding (shortest)
Recursive CTE. Returns full path with intermediate nodes.
**Query:**
```json
{
"query_type": "path_finding",
"nodes": [
{ "id": "start", "entity": "User", "node_ids": [1] },
{ "id": "end", "entity": "Project", "node_ids": [200] }
],
"path": { "type": "shortest", "from": "start", "to": "end", "max_depth": 5 },
"limit": 10
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice" },
{ "type": "MergeRequest", "id": 42, "title": "Fix bug", "state": "merged" },
{ "type": "Project", "id": 200, "name": "Omega", "full_path": "gitlab-org/omega" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "MergeRequest", "to_id": 42, "type": "AUTHORED", "path_id": 0, "step": 0 },
{ "from": "MergeRequest", "from_id": 42, "to": "Project", "to_id": 200, "type": "IN_PROJECT", "path_id": 0, "step": 1 }
]
}
```
Path structure is encoded in edges via `path_id` + `step`. Graph view uses nodes + edges directly. Table view can reconstruct path sequences by grouping edges by `path_id` and sorting by `step`.
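The grouping and sorting described above can be sketched as follows. This assumes every path edge is oriented along the path, so that each edge's `from` is the previous edge's `to`, as in the example response. `reconstructPaths` is a hypothetical helper name.

```javascript
// Hypothetical helper: rebuild ordered paths from path-finding edges by
// grouping on `path_id` and sorting each group by `step`.
function reconstructPaths(edges) {
  const byPath = new Map();
  for (const edge of edges) {
    if (edge.path_id === undefined) continue; // not a path edge
    if (!byPath.has(edge.path_id)) byPath.set(edge.path_id, []);
    byPath.get(edge.path_id).push(edge);
  }
  return [...byPath.entries()]
    .sort(([a], [b]) => a - b)
    .map(([pathId, pathEdges]) => {
      const ordered = [...pathEdges].sort((a, b) => a.step - b.step);
      // Node sequence: the source of the first edge, then every edge target.
      const nodeKeys = [
        `${ordered[0].from}:${ordered[0].from_id}`,
        ...ordered.map((e) => `${e.to}:${e.to_id}`),
      ];
      return { pathId, edges: ordered, nodeKeys };
    });
}
```

The `nodeKeys` entries use the same `Type:id` composite form as the graph view, so they can be looked up against the deduplicated `nodes` array.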
### 9.11 PathFinding (all shortest, multiple paths)
**Query:**
```json
{
"query_type": "path_finding",
"nodes": [
{ "id": "start", "entity": "User", "node_ids": [1] },
{ "id": "end", "entity": "Project", "node_ids": [200] }
],
"path": { "type": "all_shortest", "from": "start", "to": "end", "max_depth": 5 },
"limit": 10
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice" },
{ "type": "MergeRequest", "id": 42, "title": "Fix bug" },
{ "type": "Note", "id": 55, "title": "Design doc" },
{ "type": "Project", "id": 200, "name": "Omega" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "MergeRequest", "to_id": 42, "type": "AUTHORED", "path_id": 0, "step": 0 },
{ "from": "MergeRequest", "from_id": 42, "to": "Project", "to_id": 200, "type": "IN_PROJECT", "path_id": 0, "step": 1 },
{ "from": "User", "from_id": 1, "to": "Note", "to_id": 55, "type": "AUTHORED", "path_id": 1, "step": 0 },
{ "from": "Note", "from_id": 55, "to": "Project", "to_id": 200, "type": "CONTAINS", "path_id": 1, "step": 1 }
]
}
```
Two paths, same depth. User:1 and Project:200 appear once in `nodes` despite being in both paths. `path_id` distinguishes which edges belong to which path.
### 9.12 PathFinding (any, with relationship filter)
**Query:**
```json
{
"query_type": "path_finding",
"nodes": [
{ "id": "start", "entity": "User", "node_ids": [1] },
{ "id": "end", "entity": "File", "node_ids": [500] }
],
"path": { "type": "any", "from": "start", "to": "end", "max_depth": 4, "rel_types": ["AUTHORED", "IN_PROJECT", "CONTAINS"] },
"limit": 1
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice" },
{ "type": "MergeRequest", "id": 42, "title": "Fix bug" },
{ "type": "Project", "id": 101, "name": "Alpha" },
{ "type": "File", "id": 500, "path": "app/controllers/sessions_controller.rb", "name": "sessions_controller.rb" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "MergeRequest", "to_id": 42, "type": "AUTHORED", "path_id": 0, "step": 0 },
{ "from": "MergeRequest", "from_id": 42, "to": "Project", "to_id": 101, "type": "IN_PROJECT", "path_id": 0, "step": 1 },
{ "from": "Project", "from_id": 101, "to": "File", "to_id": 500, "type": "CONTAINS", "path_id": 0, "step": 2 }
]
}
```
Mixed entity types along the path (User, MR, Project, File), each with its own properties from dynamic hydration.
### 9.13 Neighbors (both directions)
**Query:**
```json
{
"query_type": "neighbors",
"node": { "id": "center", "entity": "Project", "node_ids": [101] },
"neighbors": { "node": "center", "direction": "both" },
"limit": 30
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "Project", "id": 101, "name": "Alpha", "full_path": "gitlab-org/alpha", "visibility_level": "internal" },
{ "type": "MergeRequest", "id": 42, "title": "Fix bug", "iid": 5, "state": "merged" },
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "state": "active" },
{ "type": "File", "id": 500, "path": "app/controllers/sessions_controller.rb", "name": "sessions_controller.rb", "language": "ruby" }
],
"edges": [
{ "from": "MergeRequest", "from_id": 42, "to": "Project", "to_id": 101, "type": "IN_PROJECT" },
{ "from": "User", "from_id": 1, "to": "Project", "to_id": 101, "type": "MEMBER_OF" },
{ "from": "Project", "from_id": 101, "to": "File", "to_id": 500, "type": "CONTAINS" }
]
}
```
The center node (Project:101) is just another node in the list. Edges carry the relationship types. Mixed entity types in neighbors, each with their own properties from dynamic hydration. Table: four entity tables stacked. Graph: star topology around the center.
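Because the center is not marked in the response, a consumer that needs to distinguish incoming from outgoing neighbors can derive this from the edge endpoints. A minimal sketch, assuming the caller still knows the center's type and ID from the query it sent; `classifyNeighborEdges` is a hypothetical name.

```javascript
// Hypothetical helper: classify neighbor edges relative to a center node.
function classifyNeighborEdges(edges, centerType, centerId) {
  const incoming = [];
  const outgoing = [];
  for (const edge of edges) {
    const fromCenter = edge.from === centerType && edge.from_id === centerId;
    const toCenter = edge.to === centerType && edge.to_id === centerId;
    // A self-loop on the center would land in both lists.
    if (fromCenter) outgoing.push(edge);
    if (toCenter) incoming.push(edge);
  }
  return { incoming, outgoing };
}

// Usage with the edges from the response above (center is Project:101).
const neighborEdges = [
  { from: 'MergeRequest', from_id: 42, to: 'Project', to_id: 101, type: 'IN_PROJECT' },
  { from: 'User', from_id: 1, to: 'Project', to_id: 101, type: 'MEMBER_OF' },
  { from: 'Project', from_id: 101, to: 'File', to_id: 500, type: 'CONTAINS' },
];
const aroundProject = classifyNeighborEdges(neighborEdges, 'Project', 101);
```

For this response, IN_PROJECT and MEMBER_OF are incoming and CONTAINS is outgoing.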
### 9.14 Neighbors (filtered by relationship type)
**Query:**
```json
{
"query_type": "neighbors",
"node": { "id": "center", "entity": "User", "node_ids": [1] },
"neighbors": { "node": "center", "direction": "outgoing", "rel_types": ["AUTHORED"] },
"limit": 20
}
```
**Response:**
```json
{
"columns": [],
"nodes": [
{ "type": "User", "id": 1, "username": "alice", "name": "Alice", "state": "active" },
{ "type": "MergeRequest", "id": 42, "title": "Fix bug", "iid": 5, "state": "merged" },
{ "type": "Note", "id": 50, "title": "Design review" }
],
"edges": [
{ "from": "User", "from_id": 1, "to": "MergeRequest", "to_id": 42, "type": "AUTHORED" },
{ "from": "User", "from_id": 1, "to": "Note", "to_id": 50, "type": "AUTHORED" }
]
}
```
Only AUTHORED relationships. Neighbors are mixed types (MergeRequest, Note), filtered to outgoing from the center user.
---
## 10. Frontend rendering
### Ontology cache
On app load, fetch the ontology once and build lookup maps:
```javascript
let ontologyCache = null;

async function loadOntology() {
  const schema = await getSchema();
  ontologyCache = {
    nodes: Object.fromEntries(schema.nodes.map(n => [n.name, n])),
    edges: Object.fromEntries(schema.edges.map(e => [e.name, e])),
  };
}

function getNodeMeta(entityType) {
  return ontologyCache?.nodes[entityType] ?? {};
}
```
### Graph view
Nodes and edges go straight to Three.js. No extraction step needed -- the response is already graph-shaped.
```javascript
export function buildGraphData(response) {
  const nodesById = new Map();
  for (const node of response.nodes) {
    const meta = getNodeMeta(node.type);
    const compositeId = `${node.type}:${node.id}`;
    nodesById.set(compositeId, {
      id: compositeId,
      label: node[meta.label_field] || node.name || node.title || node.username || String(node.id),
      type: node.type.toLowerCase(),
      domain: meta.domain || null,
      style: meta.style || {},
      properties: node,
    });
  }
  const edges = response.edges.map(e => ({
    source: `${e.from}:${e.from_id}`,
    target: `${e.to}:${e.to_id}`,
    type: e.type,
    depth: e.depth,
    // Passed through so the graph view can group edges by path for highlighting.
    path_id: e.path_id,
    step: e.step,
  }));
  return { nodes: [...nodesById.values()], edges };
}
```
No `detectAliases`, no `buildNodeFromAlias`, no `extractPrefixedProperties`. The entire `graph_transform.js` file goes away.
### Table view (stacked entity tables)
Group nodes by type, render one `GlTable` per entity type:
```javascript
export function buildTableData(response) {
  const grouped = {};
  for (const node of response.nodes) {
    if (!grouped[node.type]) grouped[node.type] = [];
    grouped[node.type].push(node);
  }
  const computedColumns = new Set(
    (response.columns || []).filter(c => c.aggregation).map(c => c.name)
  );
  const tables = Object.entries(grouped).map(([entityType, nodes]) => {
    const meta = getNodeMeta(entityType);
    // Union of keys across all nodes of this type, so a property that is
    // missing from the first node still gets a column.
    const columns = [...new Set(nodes.flatMap(n => Object.keys(n)))].filter(k => k !== 'type');
    return { entityType, meta, columns, nodes, computedColumns };
  });
  return tables;
}
```
Each entity type becomes its own table. The frontend stacks them vertically. Edge tables can be inserted between entity tables to show relationships. Computed columns (from aggregation) are rendered alongside entity properties, with distinct formatting if needed.
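The edge tables mentioned above could be built the same way as the entity tables, by grouping on the edge signature. A minimal sketch under that assumption; `buildEdgeTables` is a hypothetical companion to `buildTableData` and does not exist in the current frontend.

```javascript
// Hypothetical helper: group edges into one table per
// (source type, relationship type, target type) combination.
function buildEdgeTables(response) {
  const grouped = new Map();
  for (const edge of response.edges || []) {
    const key = `${edge.from}|${edge.type}|${edge.to}`;
    if (!grouped.has(key)) {
      grouped.set(key, { from: edge.from, type: edge.type, to: edge.to, rows: [] });
    }
    grouped.get(key).rows.push(edge);
  }
  // Map preserves insertion order, so tables appear in first-seen order.
  return [...grouped.values()];
}
```

Each returned entry names the two entity types it connects, which is enough for the frontend to decide where to place it between the stacked entity tables.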
### No per-query-type dispatch
Both `buildGraphData` and `buildTableData` work for all five query types without branching. The response shape is always the same -- the data inside varies, but the code doesn't need to care.
For path finding, the graph view renders all nodes and edges. If the frontend wants to highlight individual paths, it can group edges by `path_id`.
---
## 11. Implementation roadmap
### Server side (Rust, `gkg-server` crate)
#### Step 1: extend `ResultContext` with relationships
**File**: `crates/query-engine/src/enforce.rs`
Add `relationships` to `ResultContext` and populate from `input.relationships` during `enforce_return()`:
```rust
pub struct ResultContext {
    pub query_type: Option<QueryType>,
    nodes: HashMap<String, RedactionNode>,
    entity_auth: HashMap<String, EntityAuthConfig>,
    pub relationships: Vec<RelationshipRef>, // NEW
}

pub struct RelationshipRef {
    pub rel_type: String,
    pub from_alias: String,
    pub to_alias: String,
    pub min_hops: u32,
    pub max_hops: u32,
}
```
One `InputRelationship` with multiple types (e.g., `types: ["AUTHORED", "REVIEWED"]`) fans out to multiple `RelationshipRef` entries.
#### Step 2: update proto
**File**: `proto/gkg.proto`
Update `ExecuteQueryResult` to match the sync decisions:
```protobuf
message ExecuteQueryResult {
  string result_json = 1;
  string query_type = 2;
  repeated string raw_query_strings = 3;
  int32 row_count = 4;
}
```
Remove `redacted_count` and `execution_time_ms`. Replace `generated_sql` with `raw_query_strings` (an array).
#### Step 3: build `GraphFormatter`
**New file**: `crates/gkg-server/src/query_pipeline/graph_formatter.rs`
Implements `ResultFormatter`. The formatting logic:
1. Iterate all authorized rows
2. For each row, extract node data by alias using `ctx.nodes()`, deduplicating by `(entity_type, id)` into a HashMap
3. For each row, build instance edges from `ctx.relationships` + row node IDs
4. For PathFinding: extract path nodes from `dynamic_nodes`, emit edges with `path_id` + `step`
5. For Neighbors: extract neighbor node from `dynamic_nodes`, emit edge with relationship type
6. For Aggregation: attach computed column values to the group-by node
7. Serialize as `{ columns, nodes, edges }`
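Steps 1 to 3 and step 7 can be sketched independently of the Rust types. The version below is written in JavaScript to match the frontend examples in section 10 and is not the server implementation. The row shape (`{ alias: { type, id, ...props } }`) and the relationship shape (`{ relType, fromAlias, toAlias }`) are simplifying assumptions, and deduplicating edges is inferred from the star-topology example in section 9, where the repeated MEMBER_OF edge appears once.

```javascript
// Illustrative sketch only: dedupe nodes by (type, id) and build
// instance-level edges from the query's relationships.
// Assumed row shape: { alias: { type, id, ...props } }.
// Assumed relationship shape: { relType, fromAlias, toAlias }.
function formatGraph(rows, relationships) {
  const nodes = new Map(); // "Type:id" -> node
  const edges = new Map(); // "Type:id|REL|Type:id" -> edge
  for (const row of rows) {
    for (const node of Object.values(row)) {
      const key = `${node.type}:${node.id}`;
      if (!nodes.has(key)) nodes.set(key, node);
    }
    for (const rel of relationships) {
      const from = row[rel.fromAlias];
      const to = row[rel.toAlias];
      if (!from || !to) continue; // alias not present in this row
      const key = `${from.type}:${from.id}|${rel.relType}|${to.type}:${to.id}`;
      if (!edges.has(key)) {
        edges.set(key, {
          from: from.type, from_id: from.id,
          to: to.type, to_id: to.id,
          type: rel.relType,
        });
      }
    }
  }
  return { columns: [], nodes: [...nodes.values()], edges: [...edges.values()] };
}
```

Path finding, neighbors, and aggregation (steps 4 to 6) are left out of the sketch because they depend on `dynamic_nodes` and computed columns, whose row layout is not described here.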
#### Step 4: wire into pipeline
**File**: `crates/gkg-server/src/query_pipeline/service.rs`
Replace `RawRowFormatter` with `GraphFormatter` in the query pipeline. Keep `ContextEngineFormatter` for the LLM pipeline (GOON/TOON format). Update the gRPC service handler to populate the new proto fields (`query_type`, `raw_query_strings`, `row_count`) from pipeline output.
### Frontend side
#### Step 1: cache ontology on app load
`schema_page.vue` already fetches the schema. Expose it as a shared cache that `graph_explorer.vue` and the rendering functions use.
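One possible shape for the shared cache is a memoized loader that several components can call without triggering duplicate fetches. This is a sketch under assumptions: `createOntologyCache` is a hypothetical name, and the schema fetcher is injected rather than imported so the example stays self-contained.

```javascript
// Hypothetical shared cache: memoizes the in-flight promise so that
// concurrent callers trigger only one schema fetch.
function createOntologyCache(fetchSchema) {
  let pending = null;
  return function getOntology() {
    if (!pending) {
      pending = Promise.resolve(fetchSchema())
        .then((schema) => ({
          nodes: Object.fromEntries(schema.nodes.map((n) => [n.name, n])),
          edges: Object.fromEntries(schema.edges.map((e) => [e.name, e])),
        }))
        .catch((err) => {
          pending = null; // allow a retry after a failed fetch
          throw err;
        });
    }
    return pending;
  };
}
```

A component would create the loader once at module level, passing in the existing schema fetch function, and every caller would `await` the same promise.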
#### Step 2: delete `graph_transform.js`
Replace with the ~30 lines from `buildGraphData` (section 10). Display metadata (label_field, domain, style) comes from the cached ontology.
#### Step 3: add stacked table view
New component or refactored `query_results_table.vue` that calls `buildTableData`, renders one `GlTable` per entity type, and optionally shows edge tables between them.
#### Step 4: update graph components
- `graph_explorer.vue`: call `buildGraphData()` directly from response
- `three_graph.js`: nodes arrive with domain/label/style already resolved
- Path highlighting: group edges by `path_id` for visual emphasis
### Shared contract
A JSON Schema (draft-07) for the response envelope goes in the knowledge-graph repo at `crates/gkg-server/schemas/unified_response.json`, referenced by both server and frontend tests.
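Until the schema file exists, tests on either side could use a hand-written shape check as a stand-in. The sketch below is not the schema itself: the required fields are inferred from the examples in section 9, and `checkEnvelope` is a hypothetical name.

```javascript
// Hypothetical stand-in for schema validation: returns a list of
// human-readable problems, empty when the envelope looks well-formed.
function checkEnvelope(response) {
  const errors = [];
  for (const key of ['columns', 'nodes', 'edges']) {
    if (!Array.isArray(response[key])) errors.push(`${key} must be an array`);
  }
  const nodes = Array.isArray(response.nodes) ? response.nodes : [];
  nodes.forEach((node, i) => {
    if (typeof node.type !== 'string') errors.push(`nodes[${i}].type must be a string`);
    if (node.id === undefined) errors.push(`nodes[${i}].id is required`);
  });
  const edges = Array.isArray(response.edges) ? response.edges : [];
  edges.forEach((edge, i) => {
    for (const field of ['from', 'from_id', 'to', 'to_id', 'type']) {
      if (edge[field] === undefined) errors.push(`edges[${i}].${field} is required`);
    }
  });
  return errors;
}
```

Optional edge fields such as `depth`, `path_id`, and `step` are deliberately not checked, since they only appear for some query types.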