Add task request to the indexer

Dmitry Gruzd requested to merge 424115-add-heartbeat into main

What does this MR do and why?

This MR adds an HTTP polling mechanism to fetch tasks from Rails. Every 10 seconds, the indexer sends a GET request to the GitLab Internal API (right now it is set to /api/v4/internal/search/zoekt/:node_uuid/tasks).

For now we're not going to do anything with the response from Rails, so this initial implementation simply provides a "heartbeat" that lets Rails keep track of which indexers are online. Later, Rails will start sending indexing tasks back to the indexer in the response. That will replace our current "push"-based indexing, where Rails sends requests to zoekt-indexer to trigger indexing. Polling like this will also allow for async callbacks when indexing is finished. More importantly, it will let us bring more zoekt indexers online without reconfiguring Rails: Rails can respond to new indexers by allocating them some projects to index. The "heartbeat" functionality will then be used to detect indexers that have gone offline and reallocate their projects to another indexer that is online.

New arguments:

  • node_name (defaults to hostname)
  • self_url - URL to reach the node from GitLab
  • gitlab_url - base URL to reach GitLab (http://localhost:3000 for example)
  • secret_path - path to the file containing the shared secret used to generate JWTs (see .gitlab_shell_secret)

We start sending task requests if both gitlab_url and self_url are set.
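As a rough sketch of how these arguments and the start condition could be wired up with the standard flag package: the flag names come from the list above, while the options struct, parseOptions and taskRequestsEnabled are hypothetical names chosen for illustration, and the help strings are paraphrased.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the new command-line arguments. (Hypothetical type.)
type options struct {
	nodeName   string
	selfURL    string
	gitlabURL  string
	secretPath string
}

// parseOptions parses the arguments into an options value.
// node_name falls back to the machine's hostname when not given.
func parseOptions(args []string) (options, error) {
	hostname, err := os.Hostname()
	if err != nil {
		hostname = "" // leave the default empty if the hostname is unavailable
	}

	var o options
	fs := flag.NewFlagSet("gitlab-zoekt-indexer", flag.ContinueOnError)
	fs.StringVar(&o.nodeName, "node_name", hostname, "name of this node")
	fs.StringVar(&o.selfURL, "self_url", "", "URL to reach the node from GitLab")
	fs.StringVar(&o.gitlabURL, "gitlab_url", "", "base URL to reach GitLab")
	fs.StringVar(&o.secretPath, "secret_path", "", "path to the shared secret file")

	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return o, nil
}

// taskRequestsEnabled reports whether task requests should be sent:
// both gitlab_url and self_url must be set.
func (o options) taskRequestsEnabled() bool {
	return o.gitlabURL != "" && o.selfURL != ""
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("task requests enabled: %v\n", opts.taskRequestsEnabled())
}
```

Note that secret_path is not part of the start condition: without it the requests are still sent, but Rails rejects them.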

Example task request:

GET /api/v4/internal/search/zoekt/3869fe21-36d1-4612-9676-0b783ef2dcd7/tasks?{QUERY_PARAMS}

QUERY_PARAMS:

node.url=http://localhost:6081
node.name=m1.local
disk.all=994662584320
disk.used=532673712128
disk.free=461988872192
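One way to assemble these query parameters is with url.Values from the standard library. The nodeInfo type and its field names below are hypothetical; only the parameter keys and the sample values are taken from the example above.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// nodeInfo carries the values reported with each task request.
// (Hypothetical type, for illustration only.)
type nodeInfo struct {
	URL      string
	Name     string
	DiskAll  uint64 // total bytes
	DiskUsed uint64 // used bytes
	DiskFree uint64 // free bytes
}

// query converts the node info into URL query parameters.
func (n nodeInfo) query() url.Values {
	q := url.Values{}
	q.Set("node.url", n.URL)
	q.Set("node.name", n.Name)
	q.Set("disk.all", strconv.FormatUint(n.DiskAll, 10))
	q.Set("disk.used", strconv.FormatUint(n.DiskUsed, 10))
	q.Set("disk.free", strconv.FormatUint(n.DiskFree, 10))
	return q
}

func main() {
	n := nodeInfo{
		URL:      "http://localhost:6081",
		Name:     "m1.local",
		DiskAll:  994662584320,
		DiskUsed: 532673712128,
		DiskFree: 461988872192,
	}
	// Encode sorts the keys and percent-encodes the values,
	// so node.url is sent as http%3A%2F%2Flocalhost%3A6081.
	fmt.Println(n.query().Encode())
}
```

In the sample values, disk.used and disk.free add up to disk.all, which is a quick sanity check when reading these numbers in logs.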

gitlab#424115 (closed)

How to set up and validate locally

Since the GitLab API hasn't been implemented, we can test it by:

  1. Replace /api/v4/internal/search/zoekt/:node_uuid/tasks with an existing internal API endpoint, for example /api/v4/internal/lfs:
diff --git a/internal/task_request/task_request.go b/internal/task_request/task_request.go
index 4e14429..4902e5c 100644
--- a/internal/task_request/task_request.go
+++ b/internal/task_request/task_request.go
@@ -112,7 +112,8 @@ func (h *taskRequestTimer) sendRequest() error {
 	}
 	defer h.lock.Unlock()
 
-	taskRequestPath := fmt.Sprintf("/api/v4/internal/search/zoekt/%v/tasks", h.nodeUUID)
+	// taskRequestPath := fmt.Sprintf("/api/v4/internal/search/zoekt/%v/tasks", h.nodeUUID)
+	taskRequestPath := "/api/v4/internal/lfs"
 	fullURL, err := url.JoinPath(h.gitLabURL, taskRequestPath)
 	if err != nil {
 		return err
  2. Execute make
  3. Execute ./bin/gitlab-zoekt-indexer -index_dir /tmp/ -listen :6081 -gitlab_url http://localhost:3000 -self_url http://localhost:6090
  4. Ensure that you see 401 Unauthorized messages
  5. Execute ./bin/gitlab-zoekt-indexer -index_dir /tmp/ -listen :6081 -gitlab_url http://localhost:3000 -self_url http://localhost:6090 -secret_path $GDK_DIR/gitlab/.gitlab_shell_secret
  6. Ensure that you now see error responses that are not related to authentication. This means that authentication was successful and Rails accepted the JWT. The error responses appear because the payload is incorrect for the substituted endpoint, which doesn't matter for this test.
❯ ./bin/gitlab-zoekt-indexer -index_dir /tmp/ -listen :6081 -gitlab_url http://localhost:3000 -self_url http://localhost:6090
2023/09/08 16:38:54 Starting heartbeat for 'm1.local' (http://localhost:6090) with 'http://localhost:3000' as GitLab URL
2023/09/08 16:38:54 Starting server on :6081 with '/indexer' as path prefix
2023/09/08 16:39:04 [401] {"message":"401 Unauthorized"}
2023/09/08 16:39:14 [401] {"message":"401 Unauthorized"}
❯ ./bin/gitlab-zoekt-indexer -index_dir /tmp/ -listen :6081 -gitlab_url http://localhost:3000 -self_url http://localhost:6090 -secret_path $GDK_DIR/gitlab/.gitlab_shell_secret
2023/09/08 16:40:05 Starting heartbeat for 'm1.local' (http://localhost:6090) with 'http://localhost:3000' as GitLab URL
2023/09/08 16:40:05 Starting server on :6081 with '/indexer' as path prefix
2023/09/08 16:40:15 [500] {"message":"500 Internal Server Error"}
2023/09/08 16:40:25 [500] {"message":"500 Internal Server Error"}
Edited by Dylan Griffith