
Set FromHash from zoektSHA for indexing

Terri Chu requested to merge tchu-fix-set-fromhash into main

Background

Related to #5 (closed)

While working on another issue, I found that FromHash was not being set. As a result, every indexing run performs a full index of the repository instead of an incremental one.

What this MR does

  • during indexing, grab the recorded zoekt SHA from the repository metadata
  • if the zoekt SHA is not found, set FromHash to "" to trigger a full repository index
  • if the zoekt SHA is found, ask Gitaly if it still exists before setting FromHash to the SHA (or to "" if the SHA is gone, which likely means a force push occurred)
  • add a custom error type
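
The decision described in the list above can be sketched roughly as follows. This is a minimal illustration and not the code in this MR: `commitChecker`, `fakeGitaly`, `CommitExists`, and `resolveFromHash` are hypothetical names, the Gitaly lookup is replaced by an in-memory map, and the custom error type is left out.

```go
package main

import "fmt"

// commitChecker is a hypothetical stand-in for the Gitaly client;
// the real client's method names may differ.
type commitChecker interface {
	CommitExists(sha string) (bool, error)
}

// fakeGitaly is an in-memory checker used only for this illustration.
type fakeGitaly map[string]bool

func (f fakeGitaly) CommitExists(sha string) (bool, error) {
	return f[sha], nil
}

// resolveFromHash returns the value to use for FromHash.
// An empty string means "run a full repository index".
func resolveFromHash(recordedSHA string, gitaly commitChecker) (string, error) {
	// No SHA recorded in the repository metadata: full index.
	if recordedSHA == "" {
		return "", nil
	}

	exists, err := gitaly.CommitExists(recordedSHA)
	if err != nil {
		return "", err
	}

	// SHA is gone (likely a force push): fall back to a full index.
	if !exists {
		return "", nil
	}

	// SHA still exists: index incrementally from it.
	return recordedSHA, nil
}

func main() {
	gitaly := fakeGitaly{"abc123": true}
	for _, sha := range []string{"", "abc123", "gone456"} {
		from, err := resolveFromHash(sha, gitaly)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("recorded=%q FromHash=%q\n", sha, from)
	}
}
```

Returning "" for both the "never indexed" and the "SHA no longer exists" cases keeps a single code path for the full-index fallback.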

How to test

I manually tested this in GDK using the flightJS project (project ID 7).

Apply this diff to see print statements
diff --git a/internal/indexer/indexer.go b/internal/indexer/indexer.go
index 3836fe5..08a5784 100644
--- a/internal/indexer/indexer.go
+++ b/internal/indexer/indexer.go
@@ -3,6 +3,7 @@ package indexer
 import (
        "context"
        "errors"
+       "fmt"
 
        custom_error "gitlab.com/gitlab-org/gitlab-zoekt-indexer/internal"
        "gitlab.com/gitlab-org/gitlab-zoekt-indexer/internal/gitaly"
@@ -159,6 +160,8 @@ func (i *Indexer) indexRepository() error {
                i.gitalyClient.FromHash = ""
        }
 
+       fmt.Printf("FromHash %v\nToHash %v\n", i.gitalyClient.FromHash, i.gitalyClient.ToHash)
+
        err = i.gitalyClient.EachFileChange(putFunc, delFunc)
 
        if err != nil {

GDK

  1. stop zoekt indexer on gdk: gdk stop zoekt-dynamic-indexserver-development
  2. find the project directory from rails console: "#{Project.find(7).repository.disk_path}.git"

zoekt indexer

  1. run the server: make watch-run listen=:6061 index_dir=<REPLACE_WITH_GDK_DIR>/zoekt-data/development/index
  2. cleanup any existing indexed data from zoekt: curl -XPOST -H 'Content-Type: application/json' http://127.0.0.1:6061/indexer/truncate
  3. index the project
    ➜ curl -XPOST -d '{"GitalyConnectionInfo": {"Address": "unix:/<REPLACE_WITH_GDK_DIR>/praefect.socket", "Storage":    "default", "Path": "@hashed/79/02/7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451.git"}, "RepoId":7, "FileSizeLimit": 2097152, "Timeout": "1h"}' -H 'Content-Type: application/json' http://127.0.0.1:6061/indexer/index
  4. note that FromHash is an empty string, and ToHash is populated
  5. push a change to the project (either in the UI or git)
  6. index the project again
  7. note that FromHash and ToHash are populated
Edited by Terri Chu
