[gkg] [server] FFI vs Dedicated Process
Problem to Solve
The current integration of the Rust-based Knowledge Graph (KG) functionality relies on a Foreign Function Interface (FFI) within the Go-based gitlab-zoekt-indexer service. For more details, see Use FFI to call knowledge graph indexer (gitlab-org/gitlab-zoekt-indexer#89 - closed).
There were two primary motivations for using FFI to execute the Knowledge Graph code:
- Being able to ship this binary and service as part of Omnibus
- Potential complexities with deployment as a result of the above
After meeting with @fzimmer @andrewn and @mbruemmer, we learned that we are committing to our segmentation strategy. This means the previous Omnibus constraints no longer apply to GKG.
Our current approach of using FFI with the deployed version of the GitLab Knowledge Graph may run into several hurdles in the future, as raised and discussed in this sync meeting. The primary high-level concerns with the FFI approach are:
- Querying: The Knowledge Graph querying abilities require additional query building, connection establishment, and result type mapping. See this crate and this issue for more details. We have also raised concerns about unsafe FFI code here.
- Separation of Concerns: The Knowledge Graph will only increase in complexity for both indexing and querying, especially when we consider indexing server-side entities. Currently, we are also integrating some business logic inside the Zoekt Indexer project under the FFI and Omnibus constraints. This code can be avoided through a long-lived Rust process declared under a Knowledge Graph server-side crate, which shares common logic with the local binary.
- Upgrading Independently: Under the current FFI model, the Knowledge Graph requires two upgrade maintenance points for any API changes, the first being gitlab-org/rust/knowledge-graph and the second being gitlab-org/gitlab-zoekt-indexer. We should design our deployment solution to be upgradable outside of Rails and the Zoekt indexing service for future maintenance and feature delivery to customers.
The team needs to closely examine the tradeoffs of an FFI-based approach versus a separate-process approach to set ourselves up for a long-term solution.
Proposed Solution(s)
The team needs to discuss and evaluate whether FFI or a dedicated GKG process is the better long-term approach.
We can explore moving away from the FFI-based integration and adopting a dedicated process model, initially captured as an idea here. At a high level, this means the Knowledge Graph functionality could be encapsulated within its own long-running process and deployed as a "sidecar" container within the same Kubernetes pod as the Zoekt containers. The existing gitlab-zoekt-indexer service can evolve to act as a lightweight proxy and orchestrator, retaining its responsibilities for node registration, task management, and Gitaly interaction, while forwarding all KG-specific indexing and querying requests to the dedicated Rust process over an internal API.
For details on what this could look like, see this proposal: #168 (comment 2708406917)