chore(indexer): fold etl-engine, sdlc module and code module into a single indexer crate

What does this MR do and why?

Consolidates the etl-engine crate and the gkg-server::indexer module into a single indexer crate.

Previously, indexing logic was split across two places: etl-engine had the message processing engine (broker, handlers, destinations), while gkg-server/src/indexer/ had the domain modules (SDLC, Code) and the run() entrypoint. As a result, gkg-server owned indexing concerns that don't belong in a web server.

Now the indexer crate contains everything:

  • The engine (unchanged: broker, handler registry, worker pool, destinations)
  • The SDLC and Code domain modules (moved from gkg-server/src/indexer/modules/)
  • IndexerConfig and run() (moved from gkg-server/src/indexer/mod.rs)
  • Topic definitions (moved from gkg-server/src/indexer/topic.rs)

gkg-server now depends on indexer directly and constructs an IndexerConfig from its own AppConfig before calling indexer::run(). The gkg-server::indexer module is deleted.
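
The new boundary between gkg-server and the indexer crate can be sketched roughly as below. This is a minimal, self-contained illustration and not the actual code from this MR: the fields on AppConfig and IndexerConfig, the From conversion, and the synchronous run() signature are all assumptions made for the example (the real run() is likely async and takes more configuration).

```rust
// Hypothetical stand-in for the `indexer` crate. Only the names
// IndexerConfig, IndexerError and run() come from the MR description;
// every field and signature here is an assumption.
mod indexer {
    #[derive(Debug, Clone, PartialEq)]
    pub struct IndexerConfig {
        pub nats_url: String,
        pub clickhouse_url: String,
        pub worker_count: usize,
    }

    #[derive(Debug)]
    pub enum IndexerError {
        InvalidConfig(String),
    }

    // Placeholder entrypoint: the real one would start the broker,
    // register handlers and spawn the worker pool.
    pub fn run(config: &IndexerConfig) -> Result<(), IndexerError> {
        if config.worker_count == 0 {
            return Err(IndexerError::InvalidConfig(
                "worker_count must be greater than zero".to_string(),
            ));
        }
        Ok(())
    }
}

// Hypothetical gkg-server configuration; it holds web server settings
// alongside the values the indexer needs.
struct AppConfig {
    http_bind: String,
    nats_url: String,
    clickhouse_url: String,
    indexer_workers: usize,
}

// gkg-server owns the mapping from its own config to IndexerConfig,
// so the indexer crate never needs to know about AppConfig.
impl From<&AppConfig> for indexer::IndexerConfig {
    fn from(app: &AppConfig) -> Self {
        indexer::IndexerConfig {
            nats_url: app.nats_url.clone(),
            clickhouse_url: app.clickhouse_url.clone(),
            worker_count: app.indexer_workers,
        }
    }
}

fn main() {
    let app = AppConfig {
        http_bind: "127.0.0.1:8080".to_string(),
        nats_url: "nats://localhost:4222".to_string(),
        clickhouse_url: "http://localhost:8123".to_string(),
        indexer_workers: 4,
    };

    let config = indexer::IndexerConfig::from(&app);
    match indexer::run(&config) {
        Ok(()) => println!("indexer started (web server would bind {})", app.http_bind),
        Err(err) => eprintln!("indexer failed to start: {:?}", err),
    }
}
```

Keeping the conversion on the gkg-server side means the dependency only points one way: gkg-server knows about indexer, and indexer stays unaware of the web server's configuration type.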

Also fixes a dependency naming collision: the cli crate was importing code-indexer under the name indexer. That dependency is now imported as code-indexer, which frees up the indexer name for the new crate.

All etl_engine:: imports across the codebase are updated to indexer::. AGENTS.md and README.md are updated to match.

New file tree

crates/indexer/src/
├── clickhouse/        # ClickHouse destination
├── modules/
│   ├── code/          # Git repository indexing via Gitaly
│   └── sdlc/          # SDLC entity indexing (MRs, CI, issues, etc.)
├── nats/              # NATS JetStream broker
├── testkit/           # Test mocks and builders
├── lib.rs             # IndexerConfig, run(), IndexerError
├── ...                # ETL engine files
└── worker_pool.rs

Testing

Integration and unit tests

Performance Analysis

  • This merge request does not introduce any performance regression. If a performance regression is expected, explain why.
Edited by Jean-Gabriel Doyon
