
Handle force pushes in Code Indexer

What does this MR do and why?

While writing the ActiveContext Code Embeddings runbook, we realized that force-pushes are not yet handled in the Embeddings Indexing pipeline. We eventually agreed that force-pushes should be handled like any other reindex.

In this MR, we update the ActiveContext::Code::Indexer to detect force-pushes and trigger a force_reindex when running the gitlab-elasticsearch-indexer.

Solution Summary

  1. Add a reindexing field to the gitlab_active_context_code index (MR: !201428 (merged))
  2. Handle force reindexing in the Go Indexer (MR: gitlab-elasticsearch-indexer!704 (merged))
  3. In Rails, when there is a force-push, call the Go Indexer with options from_sha="" and force_reindex=true (this issue)
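
The decision in step 3 can be sketched roughly as follows. This is a minimal illustration and not the code in this MR: `IndexerOptions`, `indexer_options`, and the `ancestor_check` callable are hypothetical names, and the sketch assumes a force-push is recognized by the last indexed commit no longer being an ancestor of the new target commit. Initial indexing (no last commit recorded yet) is not covered here.

```ruby
# Hypothetical container for the options passed to the Go Indexer.
IndexerOptions = Struct.new(:from_sha, :to_sha, :force_reindex, keyword_init: true)

# Hypothetical helper. `ancestor_check` is any callable that answers
# "is commit `a` an ancestor of commit `b`?" (an assumption standing in for
# whatever ancestry lookup the real Indexer uses).
def indexer_options(last_commit:, to_sha:, ancestor_check:)
  if ancestor_check.call(last_commit, to_sha)
    # Regular push: index incrementally from the last indexed commit.
    IndexerOptions.new(from_sha: last_commit, to_sha: to_sha, force_reindex: false)
  else
    # History was rewritten: start from scratch and force a reindex.
    IndexerOptions.new(from_sha: "", to_sha: to_sha, force_reindex: true)
  end
end
```

Passing the ancestry check in as a callable keeps the sketch independent of Gitaly or any repository API, so it can be exercised with plain lambdas.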

References

Screenshots or screen recordings

N/A - see validation steps below

How to set up and validate locally

1 - Set up test project

On your local GDK, create a test project and add files.

In this example, I've created the project gitlab-duo/force-push-test with the following commits and files:

[Screenshots: initial commits and initial files of the example project]

2 - Initial Indexing

  1. Ensure that you have enabled indexing for your project:

    Feature.enable(:active_context_code_index_project, Project.find(<id_of_selected_test_project>))
  2. Follow the setup tasks in #550418 (comment 2610944159) up to the "Run index_repository task" step.

    Note: ensure that the namespace you are processing is the same as the namespace of your test project.

  3. Verify that the files are indexed in Elasticsearch correctly.

    [Screenshot: Elasticsearch content after initial indexing]

  4. Verify that the Ai::ActiveContext::Code::Repository record has the correct last_commit, e.g.:

    Ai::ActiveContext::Code::Repository.find_by(project_id: <id_of_test_project>)
    => <Ai::ActiveContext::Code::Repository:0x0000000100e5a640
    id: 6,
    project_id: 79,
    connection_id: 1,
    enabled_namespace_id: 1,
    metadata: {"initial_indexing_last_queued_item"=>"ead4dd4a7d5fe2e032b7dbf5b4559853079ed286c6689d3a3bb67facb9eac1a7"},
    last_commit: "3f039321fa0990b24e390d0f3d39c38fb6ce8fab",
    state: "embedding_indexing_in_progress",
    indexed_at: Thu, 28 Aug 2025 01:37:26.138089000 UTC +00:00,
    created_at: Thu, 28 Aug 2025 01:22:06.779727000 UTC +00:00,
    updated_at: Thu, 28 Aug 2025 01:37:26.139551000 UTC +00:00,
    initial_indexing_last_queued_item: "ead4dd4a7d5fe2e032b7dbf5b4559853079ed286c6689d3a3bb67facb9eac1a7",
    incremental_indexing_last_queued_item: nil,
    last_error: nil>

3 - Test Force Push

You can test the force push in 2 ways:

  • Follow the validation steps in the Incremental Indexing MR: !201128 (merged). This lets the relevant workers handle the incremental updates once you have done the force push.
  • Manually run the IncrementalIndexingService once you have done the force push. We will go with this simpler test.

  1. Update the test project with force pushes (make sure to delete 1 file for testing)

    [Screenshots: example commits and files after the force push]
  2. Run the IncrementalIndexingService

    In the Rails console, run:

    r = Ai::ActiveContext::Code::Repository.find_by(project_id: <id_of_test_project>)
    Ai::ActiveContext::Code::IncrementalIndexingService.execute(r)
  3. Check active_context.log to verify that Rails is calling the Go Indexer with the expected parameters from_sha="" and force_reindex=true:

    # in the gitlab root directory
    > tail -f log/active_context.log | grep "Ai::ActiveContext::Code::Indexer"
    
    # latest log output should be
    {"severity":"INFO","time":"2025-08-28T02:12:27.124Z","class":"Ai::ActiveContext::Code::Indexer","message":"Start indexer","ai_active_context_code_repository_id":6,"project_id":79,"from_sha":"","to_sha":"eca9608a1ae36c701b36ef1aea74727b671061db","force_reindex":true}
    {"severity":"INFO","time":"2025-08-28T02:12:27.492Z","class":"Ai::ActiveContext::Code::Indexer","message":"Indexer successful","ai_active_context_code_repository_id":6,"project_id":79,"from_sha":"","to_sha":"eca9608a1ae36c701b36ef1aea74727b671061db","force_reindex":true,"status":0}
  4. Verify that the files are indexed in Elasticsearch correctly

    Note that the file deleted in the force-push should no longer have documents in the index.

    [Screenshot: Elasticsearch content after the force push]

  5. Verify that the Ai::ActiveContext::Code::Repository record has the correct last_commit, e.g.:

    Ai::ActiveContext::Code::Repository.find_by(project_id: <id_of_test_project>)
    => <Ai::ActiveContext::Code::Repository:0x000000015fe3c0a0
    id: 6,
    project_id: 79,
    connection_id: 1,
    enabled_namespace_id: 1,
    metadata:
      {"initial_indexing_last_queued_item"=>"ead4dd4a7d5fe2e032b7dbf5b4559853079ed286c6689d3a3bb67facb9eac1a7",
      "incremental_indexing_last_queued_item"=>"6cff0c6fcce7d9d3f7a31c693612d1abd18ed880e1106f23d25733ec9082a52c"},
    last_commit: "eca9608a1ae36c701b36ef1aea74727b671061db",
    state: "ready",
    indexed_at: Thu, 28 Aug 2025 02:01:32.367499000 UTC +00:00,
    created_at: Thu, 28 Aug 2025 01:22:06.779727000 UTC +00:00,
    updated_at: Thu, 28 Aug 2025 02:01:32.367782000 UTC +00:00,
    initial_indexing_last_queued_item: "ead4dd4a7d5fe2e032b7dbf5b4559853079ed286c6689d3a3bb67facb9eac1a7",
    incremental_indexing_last_queued_item: "6cff0c6fcce7d9d3f7a31c693612d1abd18ed880e1106f23d25733ec9082a52c",
    last_error: nil>
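
The log check in step 3 can also be scripted instead of read by eye. The helper below is a hypothetical sketch (`force_reindex_logged?` is not part of GitLab); it only assumes the JSON log format shown in step 3 and reports whether a line records an Indexer call with an empty from_sha and force_reindex set to true.

```ruby
require "json"

# Hypothetical helper: true when a log line from the Indexer shows a forced
# reindex, i.e. from_sha is empty and force_reindex is true.
def force_reindex_logged?(line)
  entry = JSON.parse(line)
  return false unless entry.is_a?(Hash)

  entry["class"] == "Ai::ActiveContext::Code::Indexer" &&
    entry["from_sha"] == "" &&
    entry["force_reindex"] == true
rescue JSON::ParserError
  # Lines that are not valid JSON are ignored.
  false
end
```

From the gitlab root directory, something like `File.foreach("log/active_context.log").any? { |line| force_reindex_logged?(line) }` would then confirm that at least one forced reindex was logged.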

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #560713 (closed)

Edited by Pam Artiaga
