Add task request to the indexer
What does this MR do and why?
This MR adds an HTTP polling mechanism to fetch tasks from Rails. Every 10 seconds, it sends a GET request to the GitLab internal API (right now the path is set to /api/v4/internal/search/zoekt/:node_uuid/tasks).
For now we're not going to do anything with the response from Rails, so this initial implementation simply provides a "heartbeat" that lets Rails keep track of which indexers are online. Later, Rails will start sending indexing tasks back to the indexer in the response. This will replace our current "push"-based indexing, where Rails -> zoekt-indexer requests trigger indexing.
Polling like this will also allow for async callbacks when indexing is finished. More importantly, it will allow us to bring more zoekt indexers online without reconfiguring Rails: Rails can respond to new indexers by allocating them some projects to index. The "heartbeat" functionality will then be used to detect indexers that have gone offline and reallocate their projects to another indexer that is online.
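The polling loop described above can be sketched roughly as follows. This is a minimal illustration and not the code in this MR: pollTasks and demoPoll are hypothetical names, the interval is shortened from 10 seconds so the demo finishes quickly, and a local httptest server stands in for Rails.

```go
package main

import (
	"context"
	"log"
	"net/http"
	"net/http/httptest"
	"time"
)

// pollTasks sends a GET request to taskURL on every tick of interval until
// ctx is cancelled. It returns how many requests received a response.
func pollTasks(ctx context.Context, client *http.Client, taskURL string, interval time.Duration) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	responses := 0
	for {
		select {
		case <-ctx.Done():
			return responses
		case <-ticker.C:
			req, err := http.NewRequestWithContext(ctx, http.MethodGet, taskURL, nil)
			if err != nil {
				log.Printf("building task request: %v", err)
				continue
			}
			resp, err := client.Do(req)
			if err != nil {
				log.Printf("task request failed: %v", err)
				continue
			}
			// The response body is ignored for now; the request only acts as a heartbeat.
			resp.Body.Close()
			responses++
			log.Printf("[%d]", resp.StatusCode)
		}
	}
}

// demoPoll runs pollTasks against a local test server that stands in for Rails.
func demoPoll() int {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 300*time.Millisecond)
	defer cancel()

	// "example-uuid" is a placeholder for the node UUID.
	taskURL := srv.URL + "/api/v4/internal/search/zoekt/example-uuid/tasks"
	return pollTasks(ctx, srv.Client(), taskURL, 50*time.Millisecond)
}

func main() {
	log.Printf("responses received: %d", demoPoll())
}
```

A failed request is logged and skipped instead of stopping the loop, so a temporary Rails outage would only interrupt the heartbeat until the next tick.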
New arguments:
- node_name (defaults to hostname)
- self_url - URL to reach the node from GitLab
- gitlab_url - base URL to reach GitLab (http://localhost:3000 for example)
- secret_path - path to the file containing the shared secret used to generate JWTs (see .gitlab_shell_secret)
We start sending task requests if both gitlab_url and self_url are set.
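As a rough illustration of the new arguments and the enabling condition, here is a sketch using the standard flag package. The options struct, parseOptions, and taskRequestsEnabled are hypothetical names, the help strings are paraphrased, and the binary's other flags (such as -index_dir and -listen) are left out.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the four new arguments. The struct itself is hypothetical.
type options struct {
	NodeName   string
	SelfURL    string
	GitLabURL  string
	SecretPath string
}

// parseOptions parses only the new arguments from args.
func parseOptions(args []string) (options, error) {
	// node_name defaults to the hostname; fall back to an empty name on error.
	hostname, err := os.Hostname()
	if err != nil {
		hostname = ""
	}

	var o options
	fs := flag.NewFlagSet("gitlab-zoekt-indexer", flag.ContinueOnError)
	fs.StringVar(&o.NodeName, "node_name", hostname, "name of this node")
	fs.StringVar(&o.SelfURL, "self_url", "", "URL to reach the node from GitLab")
	fs.StringVar(&o.GitLabURL, "gitlab_url", "", "base URL to reach GitLab")
	fs.StringVar(&o.SecretPath, "secret_path", "", "path to the shared secret file")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return o, nil
}

// taskRequestsEnabled reports whether both gitlab_url and self_url are set.
func (o options) taskRequestsEnabled() bool {
	return o.GitLabURL != "" && o.SelfURL != ""
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("task requests enabled:", o.taskRequestsEnabled())
}
```

In this sketch secret_path is not part of the condition, which matches the local validation steps where the indexer still sends requests without a secret and gets 401 responses.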
Example task request:
GET /api/v4/internal/search/zoekt/3869fe21-36d1-4612-9676-0b783ef2dcd7/tasks?{QUERY_PARAMS}
QUERY_PARAMS:
node.url=http://localhost:6081
node.name=m1.local
disk.all=994662584320
disk.used=532673712128
disk.free=461988872192
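One way to assemble a request URL like the example above is with net/url. This is only a sketch: nodeInfo and buildTaskRequestURL are hypothetical names, and the real code may collect and encode these values differently.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// nodeInfo groups the values reported in the query string.
// The struct and its field names are hypothetical.
type nodeInfo struct {
	UUID     string
	URL      string
	Name     string
	DiskAll  uint64
	DiskUsed uint64
	DiskFree uint64
}

// buildTaskRequestURL joins the GitLab base URL with the task request path
// and appends the node and disk parameters as a query string.
func buildTaskRequestURL(gitLabURL string, n nodeInfo) (string, error) {
	fullURL, err := url.JoinPath(gitLabURL, "/api/v4/internal/search/zoekt", n.UUID, "tasks")
	if err != nil {
		return "", err
	}

	q := url.Values{}
	q.Set("node.url", n.URL)
	q.Set("node.name", n.Name)
	q.Set("disk.all", strconv.FormatUint(n.DiskAll, 10))
	q.Set("disk.used", strconv.FormatUint(n.DiskUsed, 10))
	q.Set("disk.free", strconv.FormatUint(n.DiskFree, 10))

	return fullURL + "?" + q.Encode(), nil
}

func main() {
	u, err := buildTaskRequestURL("http://localhost:3000", nodeInfo{
		UUID:     "3869fe21-36d1-4612-9676-0b783ef2dcd7",
		URL:      "http://localhost:6081",
		Name:     "m1.local",
		DiskAll:  994662584320,
		DiskUsed: 532673712128,
		DiskFree: 461988872192,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```

Note that url.Values.Encode sorts parameters by key and percent-encodes values, so the node.url value appears escaped and the parameter order differs from the listing above.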
How to set up and validate locally
Since the GitLab API hasn't been implemented yet, we can test it as follows:
- Replace /api/v4/internal/search/zoekt/:node_uuid/tasks with an existing internal API. For example, /api/v4/internal/lfs.
diff --git a/internal/task_request/task_request.go b/internal/task_request/task_request.go
index 4e14429..4902e5c 100644
--- a/internal/task_request/task_request.go
+++ b/internal/task_request/task_request.go
@@ -112,7 +112,8 @@ func (h *taskRequestTimer) sendRequest() error {
}
defer h.lock.Unlock()
- taskRequestPath := fmt.Sprintf("/api/v4/internal/search/zoekt/%v/tasks", h.nodeUUID)
+ // taskRequestPath := fmt.Sprintf("/api/v4/internal/search/zoekt/%v/tasks", h.nodeUUID)
+ taskRequestPath := "/api/v4/internal/lfs"
fullURL, err := url.JoinPath(h.gitLabURL, taskRequestPath)
if err != nil {
return err
- Execute make
- Execute ./bin/gitlab-zoekt-indexer -index_dir /tmp/ -listen :6081 -gitlab_url http://localhost:3000 -self_url http://localhost:6090
- Ensure that you see 401 Unauthorized messages
- Execute ./bin/gitlab-zoekt-indexer -index_dir /tmp/ -listen :6081 -gitlab_url http://localhost:3000 -self_url http://localhost:6090 -secret_path $GDK_DIR/gitlab/.gitlab_shell_secret
- Ensure that you now see Error responses not related to authentication, which means that authentication was successful and Rails accepted the JWT. We get error responses because the payload is incorrect, but it doesn't matter in this case.
❯ ./bin/gitlab-zoekt-indexer -index_dir /tmp/ -listen :6081 -gitlab_url http://localhost:3000 -self_url http://localhost:6090
2023/09/08 16:38:54 Starting heartbeat for 'm1.local' (http://localhost:6090) with 'http://localhost:3000' as GitLab URL
2023/09/08 16:38:54 Starting server on :6081 with '/indexer' as path prefix
2023/09/08 16:39:04 [401] {"message":"401 Unauthorized"}
2023/09/08 16:39:14 [401] {"message":"401 Unauthorized"}
❯ ./bin/gitlab-zoekt-indexer -index_dir /tmp/ -listen :6081 -gitlab_url http://localhost:3000 -self_url http://localhost:6090 -secret_path $GDK_DIR/gitlab/.gitlab_shell_secret
2023/09/08 16:40:05 Starting heartbeat for 'm1.local' (http://localhost:6090) with 'http://localhost:3000' as GitLab URL
2023/09/08 16:40:05 Starting server on :6081 with '/indexer' as path prefix
2023/09/08 16:40:15 [500] {"message":"500 Internal Server Error"}
2023/09/08 16:40:25 [500] {"message":"500 Internal Server Error"}
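The two runs differ only in whether the request carries a JWT signed with the shared secret. For readers unfamiliar with that step, here is a standard-library sketch of HS256 signing and verification. It is an illustration under assumptions, not the code in this MR: signHS256 and verifyHS256 are hypothetical helpers, the claims and secret are placeholders, and the header that carries the token to Rails is not shown because this description does not specify it.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// signHS256 builds a compact JWT (header.payload.signature) signed with
// HMAC-SHA256 using secret.
func signHS256(secret []byte, claims map[string]any) (string, error) {
	header, err := json.Marshal(map[string]string{"alg": "HS256", "typ": "JWT"})
	if err != nil {
		return "", err
	}
	payload, err := json.Marshal(claims)
	if err != nil {
		return "", err
	}

	enc := base64.RawURLEncoding
	signingInput := enc.EncodeToString(header) + "." + enc.EncodeToString(payload)

	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	return signingInput + "." + enc.EncodeToString(mac.Sum(nil)), nil
}

// verifyHS256 reports whether token has a valid HMAC-SHA256 signature for
// secret. It checks the signature only, not the claims.
func verifyHS256(token string, secret []byte) bool {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return false
	}
	sig, err := base64.RawURLEncoding.DecodeString(parts[2])
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(parts[0] + "." + parts[1]))
	return hmac.Equal(sig, mac.Sum(nil))
}

func main() {
	// Placeholder secret; in practice this would come from the secret_path file.
	secret := []byte("example-shared-secret")

	token, err := signHS256(secret, map[string]any{
		"iss": "example-issuer", // hypothetical claim value
		"iat": time.Now().Unix(),
		"exp": time.Now().Add(time.Minute).Unix(),
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println("valid with shared secret:", verifyHS256(token, secret))
	fmt.Println("valid with wrong secret:", verifyHS256(token, []byte("wrong")))
}
```

A receiver that holds the same secret can recompute the signature and accept the request, while a request with no token or one signed with a different secret fails the check, which corresponds to the 401 responses in the first run.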