Add ability to leverage an external Vault
What does this MR do and why?
This MR allows a management cluster to be deployed against an External Vault (Community edition, or possibly OpenBao; Sylva does not support Vault Enterprise) rather than deploying its own.
This MR assumes that the Vault paths and URL are configurable (!4427 (merged) and !4451 (merged) respectively).
The MR also depends on !6061 (merged), since verifying TLS becomes crucial when communicating with an External Vault.
Assumptions regarding the External Vault
Authentication
The External Vault identity administrator should be the only one responsible for managing the human user accounts that are allowed to connect to it. Hence we chose not to enable Sylva OIDC authentication against Vault: it would introduce a new authentication path outside the control of the External Vault identity administrator.
Limited privileges granted to Sylva
This MR introduces the ability to configure the External Vault through the Vault API. As a consequence, the Sylva stack could create and configure all the necessary backends (Kubernetes auth method and KV secrets engine), as well as the policies and roles. However, we consider that this would give too many privileges to the Sylva stack: a compromised stack might compromise the External Vault and, consequently, all the platforms that depend on it.
This is why we chose to let the External Vault be responsible for creating and configuring its secret engine and k8s auth method.
The External Vault is only expected to provide an authentication token (that will be injected in the values of the Sylva stack) with CRUD rights on the path auth/<kubernetes auth path>/config.
The authentication token provided by the External Vault should be revoked by the latter as soon as the Sylva deployment is completed.
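To make the scope of that token concrete, the following is a minimal sketch that renders a Vault policy restricted to the single path described above. The function name `render_bootstrap_policy` and the default mount name `kubernetes` are assumptions made for illustration; the policy actually used by the External Vault administrator may differ.

```python
def render_bootstrap_policy(k8s_auth_path: str = "kubernetes") -> str:
    """Render an HCL policy limited to auth/<kubernetes auth path>/config.

    ASSUMPTION: "kubernetes" is only the usual default mount name; pass the
    path actually configured on the External Vault.
    """
    path = f"auth/{k8s_auth_path.strip('/')}/config"
    # CRUD rights, expressed with Vault policy capabilities.
    capabilities = ["create", "read", "update", "delete"]
    caps = ", ".join(f'"{c}"' for c in capabilities)
    return f'path "{path}" {{\n  capabilities = [{caps}]\n}}\n'


if __name__ == "__main__":
    print(render_bootstrap_policy())
```

A token bound only to such a policy cannot read secrets or change other auth methods, which matches the intent of granting limited privileges to the Sylva stack.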
Secret Lifecycle
During a deployment, the secrets of the various units are created in the External Vault secret path via the RandomSecret CRD. If a secret already exists, it is not modified.
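The create-if-absent behaviour described above can be illustrated with a small sketch. It is not the RandomSecret controller's implementation: the in-memory `dict` stands in for the Vault KV store, and `ensure_secret` is a hypothetical helper name.

```python
import secrets


def ensure_secret(store: dict, path: str, nbytes: int = 32) -> tuple:
    """Return (value, created) for `path`, generating a value only if absent.

    ASSUMPTION: `store` is a plain dict standing in for the External Vault
    secret path; a real client would read and write through the Vault API.
    """
    if path in store:
        # An existing secret is returned untouched, so redeployments are idempotent.
        return store[path], False
    value = secrets.token_urlsafe(nbytes)
    store[path] = value
    return value, True


if __name__ == "__main__":
    kv = {}
    first, created = ensure_secret(kv, "secret/unit-a")
    second, created_again = ensure_secret(kv, "secret/unit-a")
    print(created, created_again, first == second)
```

Running a deployment twice against the same External Vault therefore leaves previously generated secrets unchanged.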
Required Vault Configuration
A key/value (KV) secrets engine, version 2, must be enabled on the External Vault. Its path can be the default name `secret` or the custom name set in `.Values.security.vault.path.secret`.
The Kubernetes auth method must be enabled to allow some Sylva resources (RandomSecret and the `vault` ClusterSecretStore) to authenticate with Vault using a Kubernetes Service Account token. The access policies `secret-reader` and `secret-rw`, along with roles binding these policies to the service account `vault`, must be configured as well.
An example of External Vault configuration is given here: enable-vault-auth-k8s.sh
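For orientation, the login call that a Sylva component performs against such a configuration can be sketched as below. The sketch only builds the HTTP request for Vault's Kubernetes auth login endpoint and does not send it; the helper name `build_k8s_login_request`, the example URL and the default mount name `kubernetes` are assumptions, not values taken from this MR.

```python
import json
import urllib.request


def build_k8s_login_request(vault_url: str, role: str, sa_jwt: str,
                            auth_path: str = "kubernetes") -> urllib.request.Request:
    """Build (without sending) a Vault Kubernetes-auth login request.

    ASSUMPTION: `auth_path` is the mount name of the Kubernetes auth method
    on the External Vault; `role` is one of the roles configured there.
    """
    url = f"{vault_url.rstrip('/')}/v1/auth/{auth_path.strip('/')}/login"
    body = json.dumps({"role": role, "jwt": sa_jwt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_k8s_login_request(
        "https://vault.example.test:8200",  # hypothetical External Vault URL
        role="secret-rw",
        sa_jwt="<service account token>",
    )
    print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen` and an SSL context that trusts the External Vault CA) should return a JSON document whose `auth.client_token` field carries the Vault token.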
Time Synchronization
Use NTP to ensure that the External Vault and the management cluster nodes agree on the current time. When a Sylva component authenticates to Vault, the latter checks the `nbf` claim in the JWT issued for the service account `vault`; if Vault's clock is significantly skewed relative to the Sylva control plane nodes, authentication fails.
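The effect of clock skew on the `nbf` check can be reproduced with a short sketch. It models a generic time-window validation of the `nbf` and `exp` claims; the function name `token_usable_at` and the `leeway` parameter are illustrative assumptions and do not describe Vault's exact validation logic.

```python
def token_usable_at(claims: dict, vault_now: float, leeway: float = 0.0) -> bool:
    """Return True if the token is inside its validity window at `vault_now`.

    `vault_now` is the time as seen by the validating side (here, Vault).
    ASSUMPTION: `leeway` is a hypothetical tolerance, in seconds.
    """
    nbf = claims.get("nbf")
    exp = claims.get("exp")
    if nbf is not None and vault_now + leeway < nbf:
        return False  # Vault's clock is behind the issuer: token "not yet valid"
    if exp is not None and vault_now - leeway >= exp:
        return False  # token already expired from Vault's point of view
    return True


if __name__ == "__main__":
    claims = {"nbf": 1758783840, "exp": 1758787440}
    # Vault clock 30 seconds behind the node that issued the token.
    print(token_usable_at(claims, vault_now=1758783840 - 30))
    # Clocks in sync.
    print(token_usable_at(claims, vault_now=1758783840))
```

With the sample claims, a Vault clock running 30 seconds behind the cluster rejects a freshly issued token, which is the failure mode NTP synchronization prevents.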
Debugging
The authentication can be tested with the following script (for debugging purposes): test-login-sample.sh
Expected Output:
```
$ ./test-login-sample.sh
++++++++++++++ Token Vault +++++++++++++++++
eyJhbGciO........
++++++++++++++ Token Vault Decoded +++++++++++++++++
Header:
{
  "alg": "RS256",
  "kid": "snvS0QMUV08-k9vUQelJvJSh_YL5V-uZA9oGtAjEbls"
}
Claims:
{
  "aud": [
    "https://kubernetes.default.svc.cluster.local"
  ],
  "exp": 1758787440,
  "iat": 1758783840,
  "iss": "https://kubernetes.default.svc.cluster.local",
  "jti": "64506005-e52b-4323-b70a-009adbbe6c5f",
  "kubernetes.io": {
    "namespace": "vault",
    "serviceaccount": {
      "name": "vault",
      "uid": "1533d563-ed6b-440b-96ba-3af59f82afbe"
    }
  },
  "nbf": 1758783840,
  "sub": "system:serviceaccount:vault:vault"
}
++++++++++++++ Vault Login +++++++++++++++++
{
  "request_id": "08896d89-006a-ae35-1c1d-50628fba295a",
  "lease_id": "",
  "renewable": false,
  "lease_duration": 0,
  "data": null,
  "wrap_info": null,
  "warnings": null,
  "auth": {
    "client_token": "hvs.CAESIN.......",
    "accessor": "4p..........",
    "policies": [
      "default",
      "secret-rw"
    ],
    "token_policies": [
      "default",
      "secret-rw"
    ],
    "metadata": {
      "role": "secret-rw",
      "service_account_name": "vault",
      "service_account_namespace": "vault",
      "service_account_secret_name": "",
      "service_account_uid": "1533d563-ed6b-440b-96ba-3af59f82afbe"
    },
    "lease_duration": 3600,
    "renewable": true,
    "entity_id": "55adaee0-9bc9-2cf4-baa1-7d1002d50834",
    "token_type": "service",
    "orphan": true,
    "mfa_requirement": null,
    "num_uses": 0
  }
}
```
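The "Token Vault Decoded" part of this output can be reproduced without the script. The sketch below decodes the header and claims of a service account JWT for inspection only; it does not verify the signature, and `decode_jwt_unverified` is a hypothetical helper name, not something shipped with this MR.

```python
import base64
import json


def decode_jwt_unverified(token: str) -> tuple:
    """Return (header, claims) of a JWT WITHOUT verifying its signature.

    Intended for debugging only: never trust the decoded content for
    authorization decisions.
    """
    header_b64, claims_b64, _signature = token.split(".")

    def _decode(segment: str) -> dict:
        # JWT segments are base64url-encoded with the "=" padding stripped.
        padded = segment + "=" * (-len(segment) % 4)
        return json.loads(base64.urlsafe_b64decode(padded))

    return _decode(header_b64), _decode(claims_b64)


if __name__ == "__main__":
    import sys

    header, claims = decode_jwt_unverified(sys.argv[1])
    print(json.dumps(header, indent=2))
    print(json.dumps(claims, indent=2))
```

Comparing the decoded `nbf`, `iat` and `exp` values with the clock of the External Vault is a quick way to spot a time synchronization problem.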
Units modified
- Basic: the deployment relies on an External Vault when `.Values.security.vault.external_vault_url` is present.
- `vault-oidc`: the unit `vault-oidc` is not enabled when relying on an External Vault.
- `eso-secret-stores`: the field `.spec.provider.vault.server.caProvider` in the ClusterSecretStore `vault` is modified: its CA provider can be either the Sylva CA for the internal Vault or `.Values.security.vault.external_vault_ca` for the External Vault. To do that, the secret name `.spec.provider.vault.server.caProvider.name` is changed to `vault-ca`. This secret is configured in the unit `vault`.
- The unit `vault` is not modified and `vault-external` is introduced.
  This new unit configures Sylva and the External Vault to allow the k8s resources ClusterSecretStore and RandomSecret to rely on the latter:
  - The External Vault is configured with a long-lived token, issued from the service account `token-reviewer-sa`. This token is expected to be used by the External Vault to authenticate against the K8S cluster and to validate tokens submitted by its clients.
  - The secret `vault/vault-ca`, used by the ClusterSecretStore `vault`, is configured from `.Values.security.vault.external_vault_ca`.
  - When the External Vault does not support TLS (which could happen in a dev environment):
    - The secret `vault-ca` is not created.
    - The `CaProvider` configuration is removed from the `vault` secretstore.
    - A Kyverno policy is deployed to remove the field `.spec.connection.tLSConfig` from the CRD `RandomSecret`.

  When relying on an External Vault, two service accounts are defined so as not to mix roles:
  - The service account `vault`, used by the Sylva components `randomsecret` and `clustersecretstore` to authenticate against the External Vault.
  - The service account `token-reviewer-sa`, which has the ClusterRole `system:auth-delegator`. This role is granted to Vault to allow the latter to authenticate its clients connecting with ServiceAccounts from the management cluster.
```
$ kubectl --kubeconfig management-cluster-kubeconfig auth --as=system:serviceaccount:vault:vault can-i create tokenreview
no
$ kubectl --kubeconfig management-cluster-kubeconfig auth --as=system:serviceaccount:vault:token-reviewer-sa can-i create tokenreview
yes
```
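The decisions described in this section can be summarized in a small sketch that maps the values to the resulting behaviour. It is an illustration, not the chart's templating logic: the function name `vault_integration_plan` and the returned keys are hypothetical, and detecting "no TLS" from an `http://` URL scheme is an assumption, since the MR does not state how this case is detected.

```python
from urllib.parse import urlsplit


def vault_integration_plan(values: dict) -> dict:
    """Summarize how the units behave for a given set of values."""
    vault = values.get("security", {}).get("vault", {})
    url = vault.get("external_vault_url")
    if not url:
        # Internal Vault: the CA provider is the Sylva CA.
        return {
            "external": False,
            "vault_oidc_allowed": True,
            "ca_source": "sylva-ca",
            "strip_tls_config": False,
        }
    # ASSUMPTION: TLS support is inferred from the URL scheme.
    tls = urlsplit(url).scheme == "https"
    return {
        "external": True,
        "vault_oidc_allowed": False,  # vault-oidc is not enabled with an External Vault
        "ca_source": "external_vault_ca" if tls else None,  # no vault-ca secret without TLS
        "strip_tls_config": not tls,  # Kyverno policy removes tLSConfig from RandomSecret
    }


if __name__ == "__main__":
    example = {"security": {"vault": {"external_vault_url": "https://vault.example.test:8200"}}}
    print(vault_integration_plan(example))
```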
Related reference(s)
Closes issue #2262
Test coverage
- CI deployment to check that the MR does not break the default deployment relying on an internal KMS
- Deploy a capo/kadm management cluster relying on an External Vault
CI configuration
Below you can choose test deployment variants to run in this MR's CI.
Legend:
| Icon | Meaning | Available values |
|---|---|---|
| ☁️ | Infra Provider | capd, capo, capm3 |
| 🚀 | Bootstrap Provider | kubeadm (alias kadm), rke2 |
| 🐧 | Node OS | ubuntu, suse |
| 🛠️ | Deployment Options | light-deploy, dev-sources, ha, misc, maxsurge-0, logging |
| 🎬 | Pipeline Scenarios | Available scenario list and description |
- 🎬 preview ☁️ capd 🚀 kadm 🐧 ubuntu
- 🎬 preview ☁️ capo 🚀 rke2 🐧 suse
- 🎬 preview ☁️ capm3 🚀 rke2 🐧 ubuntu
- ☁️ capd 🚀 kadm 🛠️ light-deploy 🐧 ubuntu
- ☁️ capd 🚀 rke2 🛠️ light-deploy 🐧 suse
- ☁️ capo 🚀 rke2 🐧 suse
- ☁️ capo 🚀 kadm 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capo 🚀 kadm 🎬 wkld-k8s-upgrade 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 rolling-update-no-wkld 🛠️ ha 🐧 suse
- ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.3.x 🛠️ ha 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.3.x 🛠️ ha,misc 🐧 ubuntu
- ☁️ capo 🚀 rke2 🛠️ ha,misc 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🐧 suse
- ☁️ capm3 🚀 kadm 🐧 ubuntu
- ☁️ capm3 🚀 kadm 🎬 rolling-update-no-wkld 🛠️ ha,misc 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🎬 wkld-k8s-upgrade 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🎬 sylva-upgrade-from-1.3.x 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 rke2 🛠️ misc,ha 🐧 suse
- ☁️ capm3 🚀 rke2 🎬 sylva-upgrade-from-1.3.x 🛠️ ha,misc 🐧 suse
- ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 ck8s 🎬 no-wkld 🛠️ light-deploy,k8s-1.31 🐧 ubuntu
Global config for deployment pipelines
- autorun pipelines
- allow failure on pipelines
- record sylvactl events
Notes:
- Enabling `autorun` makes deployment pipelines run automatically, without human interaction.
- Disabling `allow failure` makes deployment pipelines mandatory for pipeline success.
- If both `autorun` and `allow failure` are disabled, deployment pipelines need manual triggering but block the pipeline.
Be aware: after configuration change, pipeline is not triggered automatically.
Please run it manually (by clicking the run pipeline button in Pipelines tab) or push new code.