chore(indexer): fold etl-engine, sdlc module and code module into a single indexer crate
What does this MR do and why?
Consolidates the etl-engine crate and the gkg-server::indexer module into a single indexer crate.
Previously, indexing logic was split across two places: etl-engine had the message processing engine (broker, handlers, destinations), while gkg-server/src/indexer/ had the domain modules (SDLC, Code) and the run() entrypoint. This meant gkg-server owned indexing concerns that do not belong in a web server.
Now the indexer crate contains everything:
- The engine (unchanged: broker, handler registry, worker pool, destinations)
- The SDLC and Code domain modules (moved from gkg-server/src/indexer/modules/)
- IndexerConfig and run() (moved from gkg-server/src/indexer/mod.rs)
- Topic definitions (moved from gkg-server/src/indexer/topic.rs)
gkg-server now depends on indexer directly and constructs an IndexerConfig from its own AppConfig before calling indexer::run(). The gkg-server::indexer module is deleted.
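A minimal sketch of what this wiring might look like. The type names AppConfig, IndexerConfig, IndexerError and run() come from this MR; every field name, the From conversion, the InvalidConfig variant and the synchronous signature of run() are assumptions made for illustration, not the actual crate API (the real entrypoint may well be async).

```rust
// Hypothetical sketch: fields, signatures and the error variant are assumed,
// not taken from the real crates.

/// Stand-in for gkg-server's AppConfig (fields are illustrative).
#[derive(Debug, Clone)]
pub struct AppConfig {
    pub bind_address: String, // webserver-only setting, not passed to the indexer
    pub nats_url: String,
    pub clickhouse_url: String,
    pub worker_count: usize,
}

/// Stand-in for the consolidated indexer crate.
pub mod indexer {
    /// Only the settings the indexer needs (fields are illustrative).
    #[derive(Debug, Clone, PartialEq)]
    pub struct IndexerConfig {
        pub nats_url: String,
        pub clickhouse_url: String,
        pub worker_count: usize,
    }

    #[derive(Debug, PartialEq)]
    pub enum IndexerError {
        /// Hypothetical variant used only by this sketch.
        InvalidConfig(String),
    }

    /// Placeholder entrypoint: validates the config and returns.
    /// The real run() would start the broker, handlers and worker pool.
    pub fn run(config: IndexerConfig) -> Result<(), IndexerError> {
        if config.worker_count == 0 {
            return Err(IndexerError::InvalidConfig(
                "worker_count must be greater than zero".to_string(),
            ));
        }
        Ok(())
    }
}

/// gkg-server maps its own config onto the indexer's config,
/// so the indexer crate never needs to know about AppConfig.
impl From<&AppConfig> for indexer::IndexerConfig {
    fn from(app: &AppConfig) -> Self {
        indexer::IndexerConfig {
            nats_url: app.nats_url.clone(),
            clickhouse_url: app.clickhouse_url.clone(),
            worker_count: app.worker_count,
        }
    }
}

/// What the gkg-server side of the call might look like.
pub fn start_indexer(app: &AppConfig) -> Result<(), indexer::IndexerError> {
    indexer::run(indexer::IndexerConfig::from(app))
}
```

Keeping the conversion on the gkg-server side means the dependency points one way only: gkg-server depends on indexer, and indexer has no knowledge of webserver configuration.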
Also fixes a dependency naming collision: the cli crate was importing code-indexer under the name indexer. That dependency is now imported as code-indexer, which frees up the indexer name for the new crate.
All etl_engine:: imports across the codebase are updated to indexer::. AGENTS.md and README.md are updated to match.
New file tree
crates/indexer/src/
├── clickhouse/ # ClickHouse destination
├── modules/
│ ├── code/ # Git repository indexing via Gitaly
│ └── sdlc/ # SDLC entity indexing (MRs, CI, issues, etc.)
├── nats/ # NATS JetStream broker
├── testkit/ # Test mocks and builders
├── lib.rs # IndexerConfig, run(), IndexerError
├── ... # ETL engine files
└── worker_pool.rs
Testing
Integration and unit tests
Performance Analysis
- This merge request does not introduce any performance regression. If a performance regression is expected, explain why.