ActiveContext code: indexing service and indexer
What does this MR do and why?
Updates the `IndexingService` to perform indexing for a repository record:
- Sets the state to `:code_indexing_in_progress`
- Calls the indexer
- Sets the state to `:embedding_indexing_in_progress`
- Enqueues refs for the ids returned from the indexer
- Catches errors, logs them, and sets the state to `:failed`
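The flow above can be sketched roughly as follows. This is a minimal illustration, not the MR's code: `FakeRepository`, `IndexingFlowSketch`, `update_state!`, and the callable `indexer` and `enqueue` arguments are all hypothetical stand-ins, and only the three state names come from this description.

```ruby
# Hypothetical stand-in for a repository record; it only remembers state changes.
class FakeRepository
  attr_reader :state, :states

  def initialize
    @state = :pending # assumed starting state, for illustration only
    @states = []
  end

  def update_state!(new_state)
    @state = new_state
    @states << new_state
  end
end

# Hypothetical sketch of the service flow described above.
class IndexingFlowSketch
  # indexer: callable taking the repository and returning an array of ids
  # enqueue: callable taking the ids to queue for embedding generation
  # logger:  optional object responding to #error
  def initialize(indexer:, enqueue:, logger: nil)
    @indexer = indexer
    @enqueue = enqueue
    @logger = logger
  end

  def execute(repository)
    repository.update_state!(:code_indexing_in_progress)
    ids = @indexer.call(repository)

    repository.update_state!(:embedding_indexing_in_progress)
    @enqueue.call(ids)
    true
  rescue StandardError => e
    # Any failure in the steps above ends with the repository marked as failed.
    @logger&.error("indexing failed: #{e.message}")
    repository.update_state!(:failed)
    false
  end
end
```

Because the indexer raises on failure, a single `rescue` is enough to cover both the indexing step and the enqueue step.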
Indexer:
- Raises errors if something went wrong so that the `IndexingService` can mark the repository as failed
- Invokes the Go elasticsearch-indexer with the right arguments and captures the response
- Extracts ids from the response using the `IndexerResponseModifier` class
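As a rough sketch of the "invoke, capture, raise on failure" behaviour, assuming the indexer is run as a subprocess: `run_indexer` and `IndexerRunError` are hypothetical names, and the command is passed in by the caller because the real binary path and arguments are not shown in this description.

```ruby
require 'open3'

# Hypothetical error class; the MR's actual error type is not shown here.
class IndexerRunError < StandardError; end

# command: the full argv array for the indexer process, supplied by the caller.
# Returns the captured stdout, or raises IndexerRunError on any failure.
def run_indexer(command)
  stdout, stderr, status = Open3.capture3(*command)

  unless status.success?
    raise IndexerRunError, "indexer exited with #{status.exitstatus}: #{stderr.strip}"
  end

  stdout
rescue Errno::ENOENT => e
  # Raised by Open3 when the executable cannot be found.
  raise IndexerRunError, "indexer binary not found: #{e.message}"
end
```

Raising instead of returning a status keeps the caller's happy path linear; the service only needs one `rescue` to mark the repository as failed.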
IndexerResponseModifier:
- Processes the response from the indexer to extract ids
- The indexer output uses a section separator and, on success, streams the ids in their own section, e.g.:
--section-start--
version,build_time
v5.6.0-16-gb587744-dev,2025-06-24-0800 UTC
--section-start--
id
hash123
hash456
- The regex looks complicated but really isn't. The comment should explain it.
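For illustration, the id extraction can also be expressed without a regex. This sketch is not the regex used in the MR: `extract_ids` is a hypothetical helper, and it assumes the ids section is the one whose header line is exactly `id`, as in the sample output above.

```ruby
SECTION_SEPARATOR = '--section-start--'

# Returns the ids from the section whose header line is "id",
# or an empty array when the response contains no such section.
def extract_ids(response)
  response.split(SECTION_SEPARATOR).each do |section|
    lines = section.lines.map(&:strip).reject(&:empty?)
    # The first non-empty line is the section header; the rest are values.
    return lines.drop(1) if lines.first == 'id'
  end
  []
end

sample = <<~OUTPUT
  --section-start--
  version,build_time
  v5.6.0-16-gb587744-dev,2025-06-24-0800 UTC
  --section-start--
  id
  hash123
  hash456
OUTPUT

p extract_ids(sample) # => ["hash123", "hash456"]
```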
References
[Index state tracking: Rollout] RepositoryIndex... (#545939 - closed)
How to set up and validate locally
- Enable indexing
- Create a repository record
- Add mock documents containing `:content` to the connected vector store (e.g. ES)
- Update the elasticsearch indexer:
  - Add the following to `Options` in `internal/mode/chunk/chunk.go`:
    PartitionName string `json:"partition_name"`
    PartitionNumber int `json:"partition_number"`
  - Change the ids streamed from `internal/mode/chunk/chunk.go` to match the ids you indexed
- Run the indexing service: `Ai::ActiveContext::Code::IndexingService.execute(repository_record)`
- Execute the embedding queues: `ActiveContext.execute_all_queues!`
- Note that the embeddings are now set for the indexed documents
- Note that the repository is marked as `:embedding_indexing_in_progress`
- [Optional] Force the indexer to fail and note that the repository is marked as `:failed`
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #545939 (closed)
Edited by Madelein van Niekerk