ActiveContext: check for ready repositories

What does this MR do and why?

Adds a new task for ActiveContext SchedulingService to check for repositories that should be marked as ready.

Repositories can remain in the embedding_indexing_in_progress state for a while because embeddings are set asynchronously. We therefore need a way to determine when the initial embeddings are finished. For this we use the queued items.

  • Adds initial_indexing_last_queued_item field to repository.metadata
  • During initial indexing, after enqueueing references for embedding generation, we set this field to the ID of the last enqueued item
  • Adds a MarkRepositoryAsReadyEvent event, which runs every hour to check for repositories in the embedding_indexing_in_progress state
    • Searches for that ID and checks whether the fields for the embedding model currently being indexed are populated
    • If they are, marks the repository as ready
    • The check runs in batches
  • Adds a feature flag, active_context_code_event_mark_repository_ready, to roll out the event

In other words, when initial indexing kicks off, the state changes to embedding_indexing_in_progress and references are added to the queue for embedding generation. If we come back an hour later and find that the last item queued during initial indexing has all of its embedding fields populated, we know initial indexing is done.
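The readiness check described above can be sketched in plain Ruby. This is a minimal illustration, not the code in this MR: the Repository struct, FakeIndex class, EMBEDDING_FIELDS constant, the "embeddings_v1" field name, the batch size, and the mark_ready_repositories method are all hypothetical stand-ins. Only the initial_indexing_last_queued_item metadata key and the state names come from the description. Setting indexed_at at the same moment is also an assumption, based on the validation steps.

```ruby
# Hypothetical stand-in for the repository record (not the real model).
Repository = Struct.new(:id, :state, :metadata, :indexed_at, keyword_init: true)

# Hypothetical in-memory stand-in for the search index: document id => fields.
class FakeIndex
  def initialize(docs = {})
    @docs = docs
  end

  def find(id)
    @docs[id]
  end
end

EMBEDDING_FIELDS = %w[embeddings_v1].freeze # assumed field name
BATCH_SIZE = 2                              # assumed batch size

# True when the document exists and every embedding field is non-empty.
def embeddings_populated?(doc, fields)
  return false if doc.nil?

  fields.all? { |field| !Array(doc[field]).empty? }
end

# Marks in-progress repositories as ready when their last queued item
# has embeddings. Returns the ids of the repositories that were marked.
def mark_ready_repositories(repositories, index, fields: EMBEDDING_FIELDS,
                            batch_size: BATCH_SIZE, now: Time.now)
  marked = []
  in_progress = repositories.select { |r| r.state == :embedding_indexing_in_progress }

  in_progress.each_slice(batch_size) do |batch|
    batch.each do |repo|
      last_id = repo.metadata[:initial_indexing_last_queued_item]
      next if last_id.nil? # nothing recorded yet, so we cannot decide
      next unless embeddings_populated?(index.find(last_id), fields)

      repo.state = :ready
      repo.indexed_at = now # assumption: indexed_at is set when marked ready
      marked << repo.id
    end
  end

  marked
end

# Usage: only the first repository's last queued item has embeddings.
example_index = FakeIndex.new(
  "doc-1" => { "embeddings_v1" => [0.1, 0.2] },
  "doc-2" => { "embeddings_v1" => nil }
)
example_repos = [
  Repository.new(id: 1, state: :embedding_indexing_in_progress,
                 metadata: { initial_indexing_last_queued_item: "doc-1" }),
  Repository.new(id: 2, state: :embedding_indexing_in_progress,
                 metadata: { initial_indexing_last_queued_item: "doc-2" })
]
p mark_ready_repositories(example_repos, example_index)
```

Checking only the last queued item keeps the hourly task cheap: it needs one lookup per repository instead of a scan over every document, on the assumption that items are processed roughly in the order they were enqueued.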

References

How to set up and validate locally

  • Enable indexing
  • Enable the feature flag Feature.enable(:active_context_code_event_mark_repository_ready)
  • Create a repository in pending state: Ai::ActiveContext::Code::Repository.create!(project: Project.first, active_context_connection: Ai::ActiveContext::Connection.active, enabled_namespace: Ai::ActiveContext::Connection.active.enabled_namespaces.first)
  • Run the index_repository task: Ai::ActiveContext::Code::SchedulingWorker.new.perform("index_repository")
  • Execute queues until there are no more queued items: ActiveContext.execute_all_queues!
  • Run the mark_repository_as_ready task: Ai::ActiveContext::Code::SchedulingWorker.new.perform("mark_repository_as_ready")
  • Check the repository record: state should be :ready and indexed_at should be set

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #545941 (closed)

Edited by Madelein van Niekerk
