feat(ontology): support per-edge destination_table for multiple edge tables

What does this MR do and why?

Part of #206 (closed). Edge tables are now defined as named objects in schema.yaml with per-table sort keys and columns, replacing the single edge_table / edge_sort_key / edge_columns fields. Edge definitions can specify which table they belong to.

This MR is currently a no-op. All existing edges default to gl_edge. No compiler, indexer, or DDL changes are needed until a second edge table is actually added, which is spec'd out in this snippet: https://gitlab.com/gitlab-org/orbit/knowledge-graph/-/snippets/5976958.

Schema changes

# Before
settings:
  edge_table: "gl_edge"
  edge_sort_key: [traversal_path, source_id, ...]
  edge_columns: [...]

# After
settings:
  default_edge_table: gl_edge
  edge_tables:
    gl_edge:
      sort_key: [traversal_path, source_id, ...]
      columns: [...]

Edge YAMLs can override the table:

# config/ontology/edges/defines.yaml
description: Definition relationships in source code
table: gl_code_edge   # optional, defaults to settings.default_edge_table
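
The fallback rule for the optional table key can be sketched in Rust as below. This is a minimal illustration, not the code in this MR: EdgeDef, Settings, and resolve_edge_table are hypothetical names, and YAML parsing is left out.

```rust
/// Hypothetical, simplified view of one edge YAML after parsing.
pub struct EdgeDef {
    /// The optional `table:` key; `None` when the YAML omits it.
    pub table: Option<String>,
}

/// Hypothetical subset of `settings` from schema.yaml.
pub struct Settings {
    pub default_edge_table: String,
}

/// Returns the table an edge is routed to: its own override if present,
/// otherwise `settings.default_edge_table`.
pub fn resolve_edge_table<'a>(edge: &'a EdgeDef, settings: &'a Settings) -> &'a str {
    edge.table
        .as_deref()
        .unwrap_or(settings.default_edge_table.as_str())
}
```

Under these assumptions, an edge with table: gl_code_edge resolves to gl_code_edge, and an edge without the key resolves to gl_edge.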

Ontology API

ontology.edge_table()              // -> default edge table name
ontology.edge_tables()             // -> all edge table names
ontology.is_edge_table(table)      // -> bool
ontology.edge_table_config(table)  // -> Option<&EdgeTableConfig> (sort_key + columns)
ontology.sort_key_for_table(table) // -> per-table sort key resolution
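
For reviewers who want a concrete picture of these accessors, here is a rough sketch of how they could sit on top of a map of named edge tables. The method names and EdgeTableConfig come from the list above. The struct fields, the constructor, the BTreeMap storage, and the return types of edge_tables and sort_key_for_table are assumptions made for illustration only.

```rust
use std::collections::BTreeMap;

/// Per-table settings (sort_key + columns), as described above.
#[derive(Debug, Clone, PartialEq)]
pub struct EdgeTableConfig {
    pub sort_key: Vec<String>,
    pub columns: Vec<String>,
}

/// Assumed internal layout; the real Ontology struct may differ.
pub struct Ontology {
    default_edge_table: String,
    edge_tables: BTreeMap<String, EdgeTableConfig>,
}

impl Ontology {
    /// Hypothetical constructor, used only to build the sketch.
    pub fn new(default_edge_table: &str, edge_tables: BTreeMap<String, EdgeTableConfig>) -> Self {
        Self { default_edge_table: default_edge_table.to_string(), edge_tables }
    }

    /// Default edge table name.
    pub fn edge_table(&self) -> &str {
        &self.default_edge_table
    }

    /// All edge table names (sorted here because of the BTreeMap).
    pub fn edge_tables(&self) -> Vec<&str> {
        self.edge_tables.keys().map(String::as_str).collect()
    }

    /// Whether `table` is one of the configured edge tables.
    pub fn is_edge_table(&self, table: &str) -> bool {
        self.edge_tables.contains_key(table)
    }

    /// Config for a named edge table, if it exists.
    pub fn edge_table_config(&self, table: &str) -> Option<&EdgeTableConfig> {
        self.edge_tables.get(table)
    }

    /// Sort key for a named edge table; `None` for unknown tables (assumed behavior).
    pub fn sort_key_for_table(&self, table: &str) -> Option<&[String]> {
        self.edge_table_config(table).map(|cfg| cfg.sort_key.as_slice())
    }
}
```

With only gl_edge configured, is_edge_table("gl_edge") is true and edge_table_config("gl_code_edge") is None in this sketch.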

Compiler

CompilerMetadata is populated from the ontology during normalize:

input.compiler.edge_tables         // HashSet<String> - all edge table names
input.compiler.default_edge_table  // String - default for new edge scans

Dedup and optimizer passes use input.compiler.edge_tables for comparison checks instead of the EDGE_TABLE constant. Construction sites (TableRef::scan) still use the constant -- switching those to input.compiler.default_edge_table and per-relationship table routing is follow-up work.
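
The difference between the constant comparison and the set-based check can be illustrated roughly as follows. The two CompilerMetadata field names are taken from the description above. The value of EDGE_TABLE and the helper functions is_edge_scan_legacy and is_edge_scan are assumptions for the sake of the sketch, not the actual pass code.

```rust
use std::collections::HashSet;

/// Legacy constant; its value here is assumed to match the old edge_table setting.
pub const EDGE_TABLE: &str = "gl_edge";

/// Subset of the compiler metadata populated from the ontology during normalize.
pub struct CompilerMetadata {
    /// All edge table names.
    pub edge_tables: HashSet<String>,
    /// Default for new edge scans.
    pub default_edge_table: String,
}

/// Hypothetical old-style check: only the single hard-coded table counts as an edge table.
pub fn is_edge_scan_legacy(table: &str) -> bool {
    table == EDGE_TABLE
}

/// Hypothetical new-style check: any table registered in the metadata counts.
pub fn is_edge_scan(meta: &CompilerMetadata, table: &str) -> bool {
    meta.edge_tables.contains(table)
}
```

While gl_edge is the only registered table, both checks agree, which is consistent with this MR being a no-op. Once a second table such as gl_code_edge is registered, only the set-based check recognizes it.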

Edited by Michael Usachenko
