[indexer] Schema version tracking with table prefix support
## Problem
GKG needs to track which schema version is active and derive table prefixes from it. This extends the V0 schema version tracking (Issue #426) with prefix derivation logic and a retention config.
## Proposal
### Schema version constant
Embed `SCHEMA_VERSION: u32` as a compile-time constant in the binary (initially `0`). Its value is sourced from the `config/SCHEMA_VERSION` file described under "Schema version file".
### ClickHouse control table
```sql
CREATE TABLE IF NOT EXISTS gkg_schema_version (
    version UInt32,
    status Enum8('active' = 1, 'migrating' = 2, 'retired' = 3, 'dropped' = 4),
    created_at DateTime DEFAULT now()
) ENGINE = ReplacingMergeTree(created_at)
ORDER BY version;
```
This table survives across schema versions — it is never prefixed or dropped.
### Table prefix derivation
```rust
/// Returns the table-name prefix for a schema version: empty for v0, `v{N}_` otherwise.
fn table_prefix(schema_version: u32) -> String {
    if schema_version == 0 {
        String::new() // no prefix for v0 (backward compatible)
    } else {
        format!("v{}_", schema_version)
    }
}

/// Prepends the version prefix to a base table name.
fn prefixed_table_name(table: &str, schema_version: u32) -> String {
    format!("{}{}", table_prefix(schema_version), table)
}
```
### Configuration
```yaml
schema:
  max_retained_versions: 2 # total table sets to keep (default: 2)
```
With the default of 2: after migrating to v2, the indexer keeps v2 (active) + v1 (rollback target), and drops v0 tables automatically.
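The retention rule can be sketched as a small pure function. This is an illustration only: the name `versions_to_drop` is hypothetical, and it assumes that every known version is less than or equal to the active one and that a setting of `0` is treated as `1` so the active version is never dropped.

```rust
/// Hypothetical helper: returns the versions whose table sets fall outside
/// the retention window, oldest first.
///
/// `known` is the list of versions that still have tables, `active` is the
/// currently active version, and `max_retained` mirrors
/// `schema.max_retained_versions`.
fn versions_to_drop(known: &[u32], active: u32, max_retained: u32) -> Vec<u32> {
    // Assumption: a window of 0 makes no sense, so clamp it to 1.
    let window = max_retained.max(1);
    // Lowest version that is still kept. Saturating arithmetic avoids
    // underflow while fewer than `window` versions exist.
    let oldest_kept = active.saturating_add(1).saturating_sub(window);
    let mut stale: Vec<u32> = known
        .iter()
        .copied()
        .filter(|v| *v < oldest_kept)
        .collect();
    stale.sort_unstable();
    stale.dedup();
    stale
}
```

With the default window, `versions_to_drop(&[0, 1, 2], 2, 2)` yields `[0]`, matching the example above: v2 and v1 are kept and v0 is dropped.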
### Schema version file
Store the version in a dedicated file `config/SCHEMA_VERSION` containing just the integer (e.g. `0`). This is simpler to diff-check than a Rust constant buried in source code. The Rust binary reads this at compile time via `include_str!` and parses it into a `u32`.
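One way to do the compile-time parse is a small `const fn`, since `str::parse` cannot be called in a constant context and `include_str!` returns the file contents including any trailing newline. The sketch below is an assumption about how this could look, not the implementation in !824; the function name `parse_schema_version` and the relative path in the comment are hypothetical.

```rust
/// Parses a decimal `u32` at compile time. Digits must come first; only
/// ASCII whitespace (typically one trailing newline) may follow.
const fn parse_schema_version(s: &str) -> u32 {
    let bytes = s.as_bytes();
    let mut value: u32 = 0;
    let mut digits = 0;
    let mut i = 0;
    // Leading digits form the number.
    while i < bytes.len() && bytes[i].is_ascii_digit() {
        value = value * 10 + (bytes[i] - b'0') as u32;
        digits += 1;
        i += 1;
    }
    // Anything after the digits must be whitespace.
    while i < bytes.len() {
        if !bytes[i].is_ascii_whitespace() {
            panic!("SCHEMA_VERSION must contain a single unsigned integer");
        }
        i += 1;
    }
    if digits == 0 {
        panic!("SCHEMA_VERSION is empty");
    }
    value
}

// In the real crate the argument would come from the file, for example
// (the relative path is an assumption):
//   parse_schema_version(include_str!("../config/SCHEMA_VERSION"))
// A literal is used here so the sketch compiles on its own.
pub const SCHEMA_VERSION: u32 = parse_schema_version("0\n");
```

Because the function is evaluated in a `const` context, a malformed file fails the build rather than surfacing at startup.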
### CI + lefthook enforcement
**CI job** (`schema-version-check`, added to `.gitlab-ci.yml` lint stage):
```bash
#!/usr/bin/env bash
# scripts/check-schema-version.sh
set -euo pipefail
BASE_REF="${CI_MERGE_REQUEST_DIFF_BASE_SHA:-origin/main}"
# Check if schema-affecting files changed in this MR
if git diff --name-only "$BASE_REF"...HEAD | grep -qE '^(config/graph\.sql|config/graph_local\.sql|config/ontology/)'; then
  # Schema or ontology changed — SCHEMA_VERSION must also be bumped
  if ! git diff "$BASE_REF"...HEAD -- config/SCHEMA_VERSION | grep -q '^+[0-9]'; then
    echo "ERROR: config/graph.sql or config/ontology/ changed but config/SCHEMA_VERSION was not bumped."
    echo "If this change affects the ClickHouse schema, bump the version."
    echo "If this is a non-schema change (e.g. comments, formatting), you can skip this check"
    echo "by adding [skip schema-version-check] to the MR description."
    exit 1
  fi
fi
echo "Schema version check passed."
```
The CI job follows the existing pattern (`agent-file-sync-check`, `ontology-schema-validate`) — lightweight alpine image, MR-only, lint stage. It also supports a `[skip schema-version-check]` escape hatch for non-schema ontology changes (e.g. description updates).
**Lefthook pre-commit hook** (added to `lefthook.yml`):
```yaml
- name: schema-version-check
  run: ./scripts/check-schema-version.sh
  glob:
    - "config/graph.sql"
    - "config/graph_local.sql"
    - "config/ontology/**/*.yaml"
```
This gives developers immediate local feedback, matching the pattern used for `agent-file-sync` and `ontology-schema` checks.
### Startup behavior
All service modes (webserver, indexer, dispatcher) read the active schema version from `gkg_schema_version` on startup. If the table doesn't exist, it is created and version 0 is recorded as `active`.
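The startup flow can be sketched against an abstraction of the control table. Everything named here (`ControlTable`, `InMemoryTable`, `resolve_active_version`) is hypothetical and stands in for the real ClickHouse client code; the in-memory type exists only to make the sketch runnable.

```rust
use std::collections::BTreeMap;

/// Mirrors the `status` enum of `gkg_schema_version`.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Active,
    Migrating,
    Retired,
    Dropped,
}

/// Hypothetical abstraction over the `gkg_schema_version` control table.
trait ControlTable {
    fn exists(&self) -> bool;
    fn create(&mut self);
    fn insert(&mut self, version: u32, status: Status);
    fn active_version(&self) -> Option<u32>;
}

/// In-memory stand-in; `rows` is `None` until the table is created.
#[derive(Default)]
struct InMemoryTable {
    rows: Option<BTreeMap<u32, Status>>,
}

impl ControlTable for InMemoryTable {
    fn exists(&self) -> bool {
        self.rows.is_some()
    }

    fn create(&mut self) {
        // Equivalent of CREATE TABLE IF NOT EXISTS: keep existing rows.
        self.rows.get_or_insert_with(BTreeMap::new);
    }

    fn insert(&mut self, version: u32, status: Status) {
        if let Some(rows) = self.rows.as_mut() {
            rows.insert(version, status);
        }
    }

    fn active_version(&self) -> Option<u32> {
        // Assumption: if several rows are active, the highest version wins.
        self.rows
            .as_ref()?
            .iter()
            .rev()
            .find(|(_, s)| **s == Status::Active)
            .map(|(v, _)| *v)
    }
}

/// Startup step shared by all service modes: bootstrap the table with
/// version 0 if it is missing, then read the active version.
fn resolve_active_version<T: ControlTable>(table: &mut T) -> Option<u32> {
    if !table.exists() {
        table.create();
        table.insert(0, Status::Active);
    }
    table.active_version()
}
```

The function returns `None` when the table exists but holds no `active` row; how a service should react in that case is not specified here and is left open in the sketch.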
## Acceptance criteria
- [ ] `SCHEMA_VERSION` stored in `config/SCHEMA_VERSION` file, read at compile time
- [ ] `gkg_schema_version` ClickHouse table created if not exists on startup
- [ ] `table_prefix()` and `prefixed_table_name()` functions implemented
- [ ] `schema.max_retained_versions` config setting with default of 2
- [ ] CI job (`schema-version-check`): fails MR if schema/ontology changes without version bump
- [ ] Lefthook pre-commit hook: same check locally
- [ ] Active schema version readable from ClickHouse by all service modes
- [ ] Unit tests for prefix derivation and version comparison
## Existing implementation to build on
- **!824** (`feat(indexer): add V0 schema version tracking and mismatch detection`) — already implements `SCHEMA_VERSION` constant, `gkg_schema_version` table, CI check script, periodic mismatch detection, and integration tests. V0.5 extends this with table prefix derivation, `max_retained_versions` config, and the `status` column in the control table.
- **!809** (`feat(migration): add distributed lock and reconciler`) — NATS KV distributed lock implementation that can be reused for the migration lock in Issue 3.
## Dependencies
Extends V0 Issue #426 (schema version tracking and mismatch detection)
## Blocks
- Issue 3: Table-prefix-aware indexer
- Issue 4: Table-prefix-aware web server
- Issue 5: Migration completion and cleanup