(Part 1) Adding support for Knowledge Graph Indexing tasks

What does this MR do and why?

Addressing: #79 (closed)

This is Part 1 of a bigger MR that is to be split into 3.

Plan

  • Part 1 (this MR): Adding parsing of Knowledge Graph tasks
  • Part 2 (!590 (merged)): Adding Graph Indexer + Graph DB Client
  • Part 3: Adding clients for Knowledge Graph Bindings and Graph DB implementation.

The MR that contains all three parts (1-3) is !563 (closed).

Context

Alongside the Zoekt Indexing tasks, we have added Knowledge Graph indexing tasks to the Rails App, and they are exposed on the same endpoint as the Zoekt Indexing tasks. See this MR. This MR implements the first step: pulling and parsing the tasks. It's not indexing or creating graph databases on the Zoekt nodes yet.

What about all those int32 and int64 changes in the MR?

In the Knowledge Graph related code, we should use int64 integers, because that's what the database uses for primary keys. Unfortunately, Zoekt/Search might be stuck with int32 until this Epic is resolved. So the common code that deals with those integers, like Locks, has been updated to int64, while the Zoekt related code stays on int32.
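To illustrate the idea, here is a minimal sketch of shared code keyed by int64 that int32 callers can still use. The `Locks` type, its methods, and the `ToInt32` helper are hypothetical names for illustration and are not taken from this MR. Widening an int32 to int64 is always lossless, so Zoekt callers only need a cast; going the other way needs a bounds check.

```go
package main

import (
	"fmt"
	"math"
	"sync"
)

// Locks is a hypothetical stand-in for shared lock code keyed by int64 IDs.
// It is an assumption for illustration, not the type used in this MR.
type Locks struct {
	mu   sync.Mutex
	held map[int64]struct{}
}

// NewLocks returns an empty lock set.
func NewLocks() *Locks {
	return &Locks{held: make(map[int64]struct{})}
}

// TryLock reports whether the lock for id was acquired.
func (l *Locks) TryLock(id int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, taken := l.held[id]; taken {
		return false
	}
	l.held[id] = struct{}{}
	return true
}

// Unlock releases the lock for id. Unlocking an ID that is not held is a no-op.
func (l *Locks) Unlock(id int64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.held, id)
}

// ToInt32 narrows an int64 ID for int32-only code and returns an error
// instead of silently wrapping around when the value does not fit.
func ToInt32(id int64) (int32, error) {
	if id < math.MinInt32 || id > math.MaxInt32 {
		return 0, fmt.Errorf("id %d does not fit in int32", id)
	}
	return int32(id), nil
}
```

In this sketch, code that holds an int32 ID would call `locks.TryLock(int64(repoID))`, while code that holds int64 IDs passes them through unchanged.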

What do the graph indexing tasks look like?

[
  {
    "name": "index_graph",
    "payload": {
      "GitalyConnectionInfo": {
        "Address": "unix:/Users/omar/gdk/praefect.socket",
        "Token": null,
        "Storage": "default",
        "Path": "@hashed/94/00/9400f1b21cb527d7fa3d3eabba93557a18ebe7a2ca4e471cfe5e4c5b4ca7f767.git"
      },
      "NamespaceId": 102,
      "RepoId": 19,
      "Callback": {
        "name": "index_graph",
        "payload": {
          "task_id": 71,
          "service_type": "knowledge_graph"
        }
      },
      "Timeout": "5400s"
    }
  }
]
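As a rough sketch of how a payload like the one above could be decoded, the following Go types mirror the JSON field names shown in the example. The type names, `ParseGraphTasks`, and `TimeoutDuration` are hypothetical and are not the parser added in this MR. IDs are decoded as int64, in line with the note on integer sizes above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// GitalyConnectionInfo mirrors the "GitalyConnectionInfo" object in the example.
type GitalyConnectionInfo struct {
	Address string  `json:"Address"`
	Token   *string `json:"Token"` // null in the example; a pointer keeps null distinct from ""
	Storage string  `json:"Storage"`
	Path    string  `json:"Path"`
}

// Callback mirrors the "Callback" object in the example.
type Callback struct {
	Name    string `json:"name"`
	Payload struct {
		TaskID      int64  `json:"task_id"`
		ServiceType string `json:"service_type"`
	} `json:"payload"`
}

// IndexGraphPayload is a hypothetical name for the payload of an "index_graph" task.
type IndexGraphPayload struct {
	GitalyConnectionInfo GitalyConnectionInfo `json:"GitalyConnectionInfo"`
	NamespaceID          int64                `json:"NamespaceId"`
	RepoID               int64                `json:"RepoId"`
	Callback             Callback             `json:"Callback"`
	Timeout              string               `json:"Timeout"`
}

// TimeoutDuration parses the Timeout string, e.g. "5400s".
func (p IndexGraphPayload) TimeoutDuration() (time.Duration, error) {
	return time.ParseDuration(p.Timeout)
}

// task keeps the payload raw until the task name is known.
type task struct {
	Name    string          `json:"name"`
	Payload json.RawMessage `json:"payload"`
}

// ParseGraphTasks decodes a task list and returns only the "index_graph" payloads.
// Tasks with other names are skipped, not rejected.
func ParseGraphTasks(data []byte) ([]IndexGraphPayload, error) {
	var tasks []task
	if err := json.Unmarshal(data, &tasks); err != nil {
		return nil, fmt.Errorf("decoding task list: %w", err)
	}
	var out []IndexGraphPayload
	for _, t := range tasks {
		if t.Name != "index_graph" {
			continue
		}
		var p IndexGraphPayload
		if err := json.Unmarshal(t.Payload, &p); err != nil {
			return nil, fmt.Errorf("decoding index_graph payload: %w", err)
		}
		out = append(out, p)
	}
	return out, nil
}
```

Keeping the payload as `json.RawMessage` until the task name is checked lets one endpoint carry several task types without a shared payload struct.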

How to set up and validate locally

  • Check out this repo and run make build-unified
  • Enable the FF knowledge_graph_indexing on some project locally in GDK
  • Enable Knowledge Graph service on all the Zoekt Nodes via Rails Console Search::Zoekt::Node.update_all(services: [0, 1])
  • In GDK, stop all the zoekt-indexer instances: gitlab-zoekt-indexer-development-1 and gitlab-zoekt-indexer-development-2. Then run the process manually in a shell: ./gitlab-zoekt-indexer/bin/gitlab-zoekt indexer -index_dir zoekt-data/development/index -listen :6080 -secret_path /Users/omar/gdk/gitlab-shell/.gitlab_shell_secret -self_url "http://localhost:6080" -search_url "http://localhost:6090" -gitlab_url http://gdk.test:3000. Make sure to change the paths to match your setup.
  • (ALTERNATIVE) Restart gitlab-zoekt-indexer-development-1, keep it running, and run gdk tail gitlab-zoekt-indexer-development-1 to see the output.
  • Push some new code to the main branch of the project for which you enabled the FF. You should see something like:
    {"time":"2025-07-28T16:20:50.355065+02:00","level":"INFO","msg":"Skipping indexing graph","namespace_id":102}
    in the output.

Feel free to ping me if you have any questions or if something is not clear.

Edited by Omar Qunsul
