@@ -237,13 +237,12 @@ The communication between GitLab and Zoekt nodes happens through bidirectional A
#### Authentication Architecture
The Zoekt integration implements a comprehensive authentication system with these key components:
The Zoekt integration implements a comprehensive JWT-based authentication system for all communication channels:
1. **Indexer → Rails Authentication (JWT)**: Zoekt indexer authenticates to GitLab Rails using JWT tokens signed with the GitLab shell secret
2. **Rails → Webserver Authentication (Basic Auth)**: GitLab Rails authenticates to Zoekt webserver using HTTP Basic Authentication via NGINX
3. **Future Planned Authentication (JWT)**: Plans to replace Basic Auth with JWT for Rails → Webserver authentication, mirroring the approach used for Indexer → Rails
1. **Indexer → Rails Authentication (JWT)**: Zoekt indexer authenticates to GitLab Rails using JWT tokens signed with the GitLab shell secret via the `Gitlab-Shell-Api-Request` header (implemented April 2025)
2. **Rails → Webserver Authentication (JWT)**: GitLab Rails authenticates to Zoekt webserver using JWT tokens signed with the GitLab shell secret via the `Gitlab-Zoekt-Api-Request` header (enforced [July 31, 2025](https://gitlab.com/gitlab-org/cloud-native/charts/gitlab-zoekt/-/merge_requests/122), replacing the previous Basic Auth approach)
This tiered authentication approach ensures secure communication in all directions while maintaining compatibility with GitLab's existing security patterns.
This unified JWT authentication approach ensures secure communication in all directions while maintaining compatibility with GitLab's existing security patterns. JWT authentication became mandatory in [gitlab-zoekt chart v3.0.0](https://gitlab.com/gitlab-org/cloud-native/charts/gitlab-zoekt/-/merge_requests/122) (July 2025) as part of [gitlab-org&17500](https://gitlab.com/groups/gitlab-org/-/epics/17500), providing better security with token expiry and consistency across all authentication channels.
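
The shared-secret JWT flow described above can be sketched with the Python standard library alone. This is a minimal illustration, not the GitLab or Zoekt implementation: the HS256 algorithm, the claim names (`iss`, `iat`, `exp`), the 60-second lifetime, the helper names `sign_token` and `verify_token`, and the raw-token header value are all assumptions made for the example. Only the header name `Gitlab-Zoekt-Api-Request` and the idea of a shared signing secret with token expiry come from the text.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(secret: bytes, issuer: str, ttl_seconds: int = 60,
               now: Optional[float] = None) -> str:
    """Build an HS256-signed token (hypothetical claims, assumed algorithm)."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"iss": issuer, "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(secret: bytes, token: str, now: Optional[float] = None) -> dict:
    """Check signature and expiry; raise ValueError on any failure."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if payload["exp"] <= current:
        raise ValueError("token expired")
    return payload


# Sender side: attach the token to the request header named in the text.
shared_secret = b"example-shell-secret"  # placeholder, not a real secret
request_headers = {"Gitlab-Zoekt-Api-Request": sign_token(shared_secret, "gitlab")}

# Receiver side: reject the request unless the token verifies.
claims = verify_token(shared_secret, request_headers["Gitlab-Zoekt-Api-Request"])
```

Because both directions sign with the same secret, one verification routine can serve both headers; the short expiry is what limits the damage of a captured token, which a static Basic Auth credential could not offer.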
#### Task Retrieval API
@@ -301,9 +300,7 @@ GitLab calls the Zoekt webserver API to:
GET /api/search
```
In deployed environments (particularly with Helm), this communication is secured with HTTP Basic Authentication configured in NGINX. This provides a simple but effective authentication layer for search requests.
The authentication approach for search is planned to transition to JWT-based authentication in the future, which will provide more granular control and better align with GitLab's authentication patterns.
This communication is secured with JWT authentication using the `Gitlab-Zoekt-Api-Request` header, providing secure and consistent authentication aligned with GitLab's authentication patterns. JWT authentication for search requests became mandatory in July 2025.
### Zoekt Infrastructure
@@ -313,7 +310,36 @@ A typical deployment includes:
- The `gitlab-zoekt` binary serving both indexing and search requests
- Universal CTags for symbol extraction
- An internal NGINX gateway for routing requests
- Gateway components for routing and authentication
#### Gateway Architecture
In Kubernetes/Helm deployments, the Zoekt infrastructure uses a multi-tier gateway architecture to handle communication between GitLab Rails and the Zoekt webserver:
1. **External Gateway (`zoekt-external-gateway`)**: Part of the deployment pod, provides the entry point for search operations from GitLab Rails. This gateway proxies requests and can optionally provide TLS termination.
2. **Internal Gateway (`zoekt-internal-gateway`)**: Part of the StatefulSet, provides an additional routing layer between the external gateway and the webserver instances. This gateway helps distribute requests across multiple webserver instances within the StatefulSet.
3. **Webserver (`zoekt-webserver`)**: The actual Zoekt search service that processes search queries against the index files and handles JWT authentication.
This three-tier architecture provides benefits including:
- **Load balancing**: Requests can be distributed across multiple webserver instances
- **TLS termination**: Optional TLS support at the gateway level
- **Operational flexibility**: Gateways can be scaled independently of webserver instances
- **Network isolation**: Additional network security through layered architecture
This structure enables high availability and load distribution while maintaining a clear organization of the relationship between namespaces, indices, nodes, and repositories.
### Current Development
#### Federated Search Using gRPC
A new [gRPC-based federated search capability](https://gitlab.com/gitlab-org/gitlab/-/issues/500087) is being developed to enhance search performance across multiple Zoekt nodes. This feature replaces the previous HTTP-based search proxying by using a more efficient gRPC streaming implementation.
#### JWT Authentication for Rails → Webserver
### Federated Search Using gRPC
In addition to the gRPC improvements, there are plans to implement JWT-based authentication for the Rails → Webserver communication flow. This would replace the current Basic Authentication approach with a more secure JWT implementation that mirrors the existing Indexer → Rails authentication. The goal is to provide a consistent, unified authentication strategy across all communication channels while leveraging GitLab's existing security infrastructure.
A [gRPC-based federated search capability](https://gitlab.com/gitlab-org/gitlab/-/issues/500087) was implemented in GitLab 18.0 (May 2025) to enhance search performance across multiple Zoekt nodes. This feature replaced the previous HTTP-based search proxying with a more efficient gRPC streaming implementation.
The gRPC federated search offers several advantages:
@@ -516,7 +536,7 @@ The gRPC federated search offers several advantages:
- Early termination once enough results are collected
- More efficient binary protocol with HTTP/2
- Processing results as they arrive rather than waiting for complete result sets
1. **Better resource management**: More granular control over search processing limits
1. **Better resource management**: More granular control over search processing limits, which avoids loading excessive results as the previous JSON approach required (e.g., 10 nodes × 5,000 results = 50,000 results)
It's important to note that while this implementation streams results between Zoekt nodes, the final results are still collected by the coordinating Zoekt node before being returned to Rails. The current implementation does not stream results to Rails. Instead, the performance benefits come from more efficient inter-node communication and the ability to stop searching once sufficient results are found, rather than exhaustively searching all repositories.
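
The early-termination behaviour described above can be illustrated with a small, self-contained sketch. It does not use gRPC: Python generators stand in for per-node result streams, and the names `node_stream` and `federated_search`, the round-robin merge order, and the limit of 100 are hypothetical choices for the example, not details of the Zoekt implementation. The point is only that a coordinator consuming streams lazily can stop once it has enough results, rather than materialising 10 × 5,000 = 50,000 of them.

```python
from typing import Iterator, List

_DONE = object()  # sentinel marking an exhausted stream


def node_stream(node_id: int, total: int, produced: List[int]) -> Iterator[str]:
    """Stand-in for one node's result stream; counts how many results it emits."""
    for i in range(total):
        produced[0] += 1
        yield f"node{node_id}-match{i}"


def federated_search(streams: List[Iterator[str]], limit: int) -> List[str]:
    """Pull from each stream in turn and stop as soon as `limit` results exist."""
    results: List[str] = []
    active = list(streams)
    while active and len(results) < limit:
        still_active = []
        for stream in active:
            item = next(stream, _DONE)
            if item is _DONE:
                continue  # this node has nothing more to offer
            results.append(item)
            if len(results) >= limit:
                return results  # early termination: remaining results are never produced
            still_active.append(stream)
        active = still_active
    return results


produced = [0]  # shared counter of results actually generated across all nodes
streams = [node_stream(n, 5_000, produced) for n in range(10)]
top = federated_search(streams, limit=100)
print(len(top), produced[0])
```

In this sketch the coordinator still assembles the full result list before returning it, which mirrors the note above that results are collected by the coordinating node rather than streamed to Rails.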
@@ -542,7 +562,7 @@ The rollout strategy has followed these steps:
- [x] Assessment of costs and performance for broader rollout
- [x] Continued performance improvements
- [x] Availability to the majority of licensed groups on GitLab.com
- [ ] General availability to all licensed groups on GitLab.com (pending)
- [x] General availability to all licensed groups on GitLab.com
For self-managed instances, administrators can enable Zoekt by installing the required components and enabling the feature in the admin area.