[Index state tracking: Rollout] Incremental updates
Description
We need to do incremental updates for git events, such as a merge commit, by running the Indexer and enqueueing embedding references.
In ee/app/services/ai/active_context/code/initial_indexing_service.rb, the repository record kicks off initial indexing, which calls the indexer, enqueues refs, and manages state.
For incremental updates, we need to do something similar without changing the state: once a repo is in the :ready state, its state does not change again.
Ai::ActiveContext::Code::Indexer.run!(repository) calls the indexer with to_sha = project.repository.commit, which picks up changes and deletions from Gitaly, so it indexes only new updates and processes deletes.
JTBD
- Decide which git events to listen for to trigger incremental indexing: should it be a merge event, etc.?
- We should ensure this isn't run until the initial indexing is done. So either add a check that the repository state is ready, or put it behind a feature flag.
- Hook into the event(s) and call something like:
repository = Ai::ActiveContext::Code::Repository.find(...)
changed_ids = Ai::ActiveContext::Code::Indexer.run!(repository)
::Ai::ActiveContext::Collections::Code.track_refs!(hashes: changed_ids, routing: repository.project_id)
This will update the documents in the vector store and enqueue references, which are processed asynchronously.
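The guard and the call sequence above could be combined roughly as follows. This is only a sketch, not the intended implementation: `IncrementalIndexingSketch` and `FakeRepository` are hypothetical names, and the indexer and tracker are injected as lambdas standing in for `Ai::ActiveContext::Code::Indexer.run!` and `::Ai::ActiveContext::Collections::Code.track_refs!` so the example runs outside the Rails app. It also assumes the state reads as the symbol `:ready` and that an empty result can skip tracking.

```ruby
# Stand-in for Ai::ActiveContext::Code::Repository (hypothetical, for illustration only).
FakeRepository = Struct.new(:project_id, :state)

# Hypothetical service: runs an incremental update without touching repository state.
class IncrementalIndexingSketch
  def initialize(indexer:, tracker:)
    @indexer = indexer # stands in for Ai::ActiveContext::Code::Indexer.run!
    @tracker = tracker # stands in for Ai::ActiveContext::Collections::Code.track_refs!
  end

  # Returns the changed ids, or nil when the run was skipped.
  def execute(repository)
    # Guard: only run once initial indexing has finished.
    # Assumption: state is comparable to :ready; the real model may expose it differently.
    return unless repository&.state == :ready

    changed_ids = @indexer.call(repository)

    # Assumption: nothing needs to be enqueued when no documents changed.
    return changed_ids if changed_ids.empty?

    @tracker.call(hashes: changed_ids, routing: repository.project_id)

    # The repository state is deliberately left unchanged.
    changed_ids
  end
end

tracked = []
service = IncrementalIndexingSketch.new(
  indexer: ->(_repository) { %w[hash_a hash_b] },
  tracker: ->(hashes:, routing:) { tracked << { hashes: hashes, routing: routing } }
)

pending_repo = FakeRepository.new(1, :pending)
ready_repo = FakeRepository.new(2, :ready)

skipped = service.execute(pending_repo) # guard returns early
indexed = service.execute(ready_repo)   # indexer and tracker both run
```

Injecting the indexer and tracker keeps the guard logic testable on its own; a feature flag check, if chosen instead of or alongside the state check, would sit next to the guard at the top of `execute`.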