Vault Shared Key (Knowledge Graph <-> GitLab)
## Problem to solve
!224386 added `Analytics::KnowledgeGraph::JwtAuth`, which signs HS256 JWTs for requests between GitLab Rails and the GKG service. It uses the `Gitlab::JwtAuthenticatable` mixin, which auto-generates a local secret file (`.gitlab_knowledge_graph_secret`) on first boot.
That works in development but not in Kubernetes:
1. Each Rails pod generates its own secret on startup, so tokens from one pod won't verify on another.
2. The GKG service in `orbit-stg` needs the same key to verify incoming JWTs and sign outbound ones for Gitaly.
Both sides need to share one HS256 key.
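The mismatch in point 1 can be reproduced with a minimal HS256 sketch that uses only the Python standard library. It is an illustration, not the `JwtAuthenticatable` implementation: the helper names (`sign_hs256`, `verify_hs256`) and the `iss` claim value are hypothetical, and the two random keys stand in for secrets generated independently by two pods.

```python
import base64
import hashlib
import hmac
import json
import secrets


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT compact form requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, key: bytes) -> str:
    """Build a compact HS256 JWT: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


def verify_hs256(token: str, key: bytes) -> bool:
    """Recompute the HMAC over header.payload and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), signature)


# Hypothetical stand-ins for two pods that each generated a secret on first boot.
pod_a_key = secrets.token_bytes(32)
pod_b_key = secrets.token_bytes(32)

token = sign_hs256({"iss": "gitlab"}, pod_a_key)
print(verify_hs256(token, pod_a_key))  # True: same key signs and verifies
print(verify_hs256(token, pod_b_key))  # False: a different key rejects the token
```

The same reasoning applies to the GKG service: unless it holds the key that Rails signed with, verification fails regardless of which pod issued the token.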
The following discussion from !224386 should be addressed:
- [ ] @ggray-gitlab started a [discussion](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/224386#note_3109667381): (+1 comment)
> @michaelangeloio This has a secret being generated for a synchronous signing and written to disk in a file. Will other services need to be able to validate these tokens? If so, how will they access it? Additionally, if this is running in a cluster, how will we ensure that the secrets stay in sync, or is there an assumption that there will only ever be one instance we need to worry about?
## Proposed solution
Store one shared key per environment in Vault and distribute it to both clusters via ExternalSecrets. Staging and production use separate keys at environment-scoped paths under `k8s/shared/knowledge-graph/`.
### Architecture
```
Vault (k8s mount)
├── shared/knowledge-graph/stg/jwt   ← staging key
│   └── key: <base64-encoded 32 bytes>
│       │
│       ├──► ESO in gstg-gitlab-gke
│       │    └──► K8s Secret "gitlab-knowledge-graph-jwt-v1"
│       │         in "gitlab" namespace
│       │         └──► mounted as file at .gitlab_knowledge_graph_secret
│       │              (read by JwtAuthenticatable)
│       │
│       └──► ESO in orbit-stg
│            └──► K8s Secret "gkg-secrets"
│                 in "gkg" namespace
│                 └──► mounted at /etc/secrets/gitlab/jwt/*
│                      (read by GKG binary)
│
└── shared/knowledge-graph/prd/jwt   ← production key (future)
```
Both clusters read the same Vault path, each authenticating through its own Kubernetes auth mount.
### Step 1 — Vault policies (config-mgmt)
**MR:** https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/13377
**File:** `environments/vault-production/kubernetes.tf`
Two changes in this file:
**a) Grant orbit-stg read access**
Add a `gkg` auth role to the `orbit-stg` cluster block so the `gkg-secrets` service account in the `gkg` namespace can read the shared path:
```terraform
gkg = {
  service_accounts = ["gkg-secrets"]
  namespaces = ["gkg"]
  readonly_secret_paths = [
    "k8s/shared/knowledge-graph/stg/jwt",
  ]
}
```
**b) Grant gstg Rails read access**
Append the shared path to the existing `gitlab` role's `readonly_secret_paths`:
```terraform
"k8s/shared/knowledge-graph/stg/jwt",
```
**File:** `environments/vault-production/secrets_policies.tf`
**c) Okta admin policy for staging secrets**
Grant the `knowledge_graph` Okta group admin access to staging secrets only, so the team can create and rotate the staging key via the Vault UI:
```terraform
"shared/knowledge-graph/stg/*" = {
  admin = {
    groups = local.groups.knowledge_graph
  }
}
```
### Step 2 — Write the secret to Vault
Generate and store the key (requires Vault admin access via Okta, granted by Step 1c):
```bash
KEY=$(openssl rand -base64 32)
vault kv put -mount=k8s shared/knowledge-graph/stg/jwt key="$KEY"
```
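Before writing the value, it may be worth checking that it decodes to exactly 32 bytes, matching the `<base64-encoded 32 bytes>` shape shown in the architecture diagram. The sketch below is a hypothetical helper (`decode_shared_key` is not part of any GitLab tooling) and assumes standard, padded base64 as produced by `openssl rand -base64 32`.

```python
import base64
import binascii

EXPECTED_KEY_BYTES = 32  # key size stated in the architecture diagram


def decode_shared_key(encoded: str) -> bytes:
    """Strictly decode a base64 key and check that it is exactly 32 bytes."""
    try:
        raw = base64.b64decode(encoded.strip(), validate=True)
    except binascii.Error as exc:
        raise ValueError("key is not valid base64") from exc
    if len(raw) != EXPECTED_KEY_BYTES:
        raise ValueError(f"expected {EXPECTED_KEY_BYTES} bytes, got {len(raw)}")
    return raw
```

Running the generated `$KEY` through a check like this catches a truncated or re-encoded value before it reaches either cluster.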
### Step 3 — Rails side ExternalSecret (k8s-workloads-gitlab-com)
**a) Create the ExternalSecret**
**File:** `releases/gitlab-external-secrets/values/values.yaml.gotmpl`
Add an entry under the existing `gitlab-shared-secrets` SecretStore:
```yaml
gitlab-knowledge-graph-jwt-v1:
  refreshInterval: 0
  secretStoreName: gitlab-shared-secrets
  target:
    creationPolicy: Owner
    deletionPolicy: Delete
  data:
    - remoteRef:
        key: knowledge-graph/stg/jwt
        property: key
        version: "1"
      secretKey: knowledge_graph_jwt_shared_key
```
The `key` is relative to the SecretStore path. `gitlab-shared-secrets` has `path: shared`, so `knowledge-graph/stg/jwt` resolves to `k8s/data/shared/knowledge-graph/stg/jwt` in Vault.
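The path resolution described above can be sketched as a small function. This assumes the SecretStore targets a KV v2 engine mounted at `k8s`, which is what the `data` segment in the resolved path implies; `kv2_data_path` is a hypothetical name used only for illustration.

```python
def kv2_data_path(mount: str, store_path: str, remote_key: str) -> str:
    """Join the mount, the KV v2 'data' segment, the store path and the remoteRef key."""
    parts = [mount, "data", store_path, remote_key]
    return "/".join(p.strip("/") for p in parts if p.strip("/"))


print(kv2_data_path("k8s", "shared", "knowledge-graph/stg/jwt"))
# k8s/data/shared/knowledge-graph/stg/jwt
```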
**b) Mount the secret as a file**
`JwtAuthenticatable` reads secrets from a file, not env vars. This needs a companion MR to the GitLab Helm chart (`gitlab-org/charts/gitlab`).
Add a `_knowledge_graph.tpl` template:
```yaml
{{- define "gitlab.knowledgeGraph.mountSecrets" -}}
{{- if .Values.global.appConfig.knowledgeGraph.enabled -}}
- secret:
    name: {{ .Values.global.appConfig.knowledgeGraph.secret }}
    items:
      - key: {{ .Values.global.appConfig.knowledgeGraph.key }}
        path: knowledge_graph/.gitlab_knowledge_graph_secret
{{- end -}}
{{- end -}}
```
Then reference it in `releases/gitlab/values/gstg.yaml.gotmpl`:
```yaml
global:
  appConfig:
    knowledgeGraph:
      enabled: true
      secret: gitlab-knowledge-graph-jwt-v1
      key: knowledge_graph_jwt_shared_key
```
### Step 4 — GKG side vault-secrets release (gitlab-helmfiles)
The GKG helm chart expects a K8s Secret via `secrets.existingSecret`. In the orbit-stg helmfile (https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-helmfiles/-/merge_requests/9971), this is set to `gkg-secrets`. That secret needs these keys:
| Key | Mount path | Purpose |
|-----|-----------|---------|
| `gitlab-jwt-verifying-key` | `/etc/secrets/gitlab/jwt/verifying_key` | Verify incoming JWTs from Rails |
| `gitlab-jwt-signing-key` | `/etc/secrets/gitlab/jwt/signing_key` | Sign outbound JWTs for Gitaly calls |
| `datalake-password` | `/etc/secrets/datalake/password` | ClickHouse datalake access |
| `graph-password` | `/etc/secrets/graph/password` | ClickHouse graph access |
HS256 is a symmetric algorithm, so the verifying and signing keys hold the same value.
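Because both files are expected to carry the same bytes, a simple sanity check can compare them once the secret is mounted. The sketch below is hypothetical (`jwt_keys_match` is not part of the GKG binary) and assumes the file names from the mount paths in the table above.

```python
import hmac
from pathlib import Path


def jwt_keys_match(jwt_dir: Path) -> bool:
    """Return True if the verifying and signing key files have identical content."""
    verifying = (jwt_dir / "verifying_key").read_bytes().strip()
    signing = (jwt_dir / "signing_key").read_bytes().strip()
    # Constant-time comparison avoids leaking how many leading bytes match.
    return hmac.compare_digest(verifying, signing)
```

Inside a GKG pod this could be called as `jwt_keys_match(Path("/etc/secrets/gitlab/jwt"))`.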
Add a vault-secrets release to `releases/gkg/helmfile.yaml.gotmpl`, following the `data-insights-platform` pattern:
```yaml
- name: gkg-secrets
  chart: oci://registry.ops.gitlab.net/gitlab-com/gl-infra/charts/vault-secrets
  version: ~1.9.0
  namespace: gkg
  installed: {{ .Values | get "gkg.installed" false }}
  labels:
    tier: inf
  values:
    - values-secrets/values.yaml.gotmpl
    - values-secrets/{{ .Environment.Name }}.yaml.gotmpl
```
**File:** `releases/gkg/values-secrets/values.yaml.gotmpl`
```yaml
authMountPath: "kubernetes/{{ default .Values.cluster .Values.cluster_vault }}"
clusterLocation: "{{ .Values.region }}"
clusterName: "{{ .Values.cluster }}"
clusterProject: "{{ .Values.google_project }}"
secretStores:
- name: gkg-secrets
role: gkg
path: shared
serviceAccount:
name: gkg-secrets
```
**File:** `releases/gkg/values-secrets/orbit-stg.yaml.gotmpl`
```yaml
externalSecrets:
  gkg-secrets:
    refreshInterval: 0
    secretStoreName: gkg-secrets
    target:
      creationPolicy: Owner
      deletionPolicy: Delete
    data:
      - remoteRef:
          key: knowledge-graph/stg/jwt
          property: key
          version: "1"
        secretKey: gitlab-jwt-verifying-key
      - remoteRef:
          key: knowledge-graph/stg/jwt
          property: key
          version: "1"
        secretKey: gitlab-jwt-signing-key
```
ClickHouse passwords (`datalake-password`, `graph-password`) come from separate Vault paths; add them once ClickHouse is provisioned for orbit-stg.
### Deployment order
1. Merge config-mgmt MR (Vault policies) — applied via Atlantis
2. Write the secret to Vault
3. Merge k8s-workloads-gitlab-com + charts/gitlab MRs (ExternalSecret + file mount for Rails)
4. Merge gitlab-helmfiles MR (vault-secrets release for GKG, extends https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-helmfiles/-/merge_requests/9971)
5. ArgoCD syncs, ESO pulls from Vault and creates K8s Secrets in both clusters
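After the final step, one way to confirm that both clusters received the same key without printing it is to compare fingerprints of the two Secret values. The sketch below is a hypothetical helper; it assumes the inputs are the base64 strings found under `.data` in each Kubernetes Secret (for example `knowledge_graph_jwt_shared_key` on the Rails side and `gitlab-jwt-verifying-key` on the GKG side), fetched separately with `kubectl`.

```python
import base64
import hashlib


def secret_fingerprint(data_value: str) -> str:
    """SHA-256 hex digest of a Kubernetes Secret value, decoded from its .data form."""
    raw = base64.b64decode(data_value, validate=True)
    # Strip surrounding whitespace so a trailing newline does not change the digest.
    return hashlib.sha256(raw.strip()).hexdigest()


def same_key(rails_value: str, gkg_value: str) -> bool:
    """True when both clusters ended up with the same underlying key."""
    return secret_fingerprint(rails_value) == secret_fingerprint(gkg_value)
```

Only the digests need to be shared between operators, so the key itself never has to leave either cluster.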