CI: adjust what Secrets are whitelisted by leak-report tool (!2846) · Merge requests · Sylva-projects / sylva-core

This MR refactors the whitelisting code to allow partial matches on the following secret names, identified by analyzing the leak reports and the CI failures:

sensitive ones:

rke2-capm3-virt-token: https://gitlab.com/sylva-projects/sylva-core/-/issues/1610
rke2-capm3-virt-management-md, rke2-capm3-virt-management-cp, kubeadm-capm3-virt-management-cp, kubeadm-capm3-virt-management-md, rke2-capm3-virt-workload, kubeadm-capm3-virt-workload passwords and ironic htpasswd: https://gitlab.com/sylva-projects/sylva-core/-/issues/1609
csi-cephfs-secret: adminKey https://gitlab.com/sylva-projects/sylva-core/-/issues/1652
gitea-postgres-replication: password https://gitlab.com/sylva-projects/sylva-core/-/issues/1665

non-sensitive ones:

"clientID"
"LOKI_USERNAME"
"adminID"
key from the gitea-keycloak-oidc-auth secret (it is the client ID of the Keycloak client)
config_environment.sh from gitea: this script contains no sensitive data such as passwords or keys
usage-bootstrap-signing and usage-bootstrap-authentication from bootstrap-token: these are booleans indicating whether the token may be used to sign the cluster-info ConfigMap or to authenticate
values-rancher-webhook-104.0.2-up0.5.2.yaml from helm-operation: it contains some Rancher webhook setup, nothing sensitive
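
For illustration, partial matching against the whitelist could look like the sketch below. The function name is_whitelisted, its signature, and the fragment list are assumptions made for this example; they are not the actual code of leaks_check_report.py.

```python
# Hypothetical whitelist fragments, picked from the names listed above.
# The real script may store and organize them differently.
WHITELIST_FRAGMENTS = [
    "rke2-capm3-virt-token",
    "capm3-virt-management",
    "clientID",
    "LOKI_USERNAME",
]


def is_whitelisted(secret_name: str, key: str, fragments=WHITELIST_FRAGMENTS) -> bool:
    """Return True if any whitelist fragment occurs in the secret name or in the key.

    A substring test (rather than equality) is what makes the match partial:
    one fragment covers both the kubeadm and rke2 variants of a secret name.
    """
    return any(fragment in secret_name or fragment in key for fragment in fragments)
```

With this approach a single fragment such as capm3-virt-management covers both rke2-capm3-virt-management-cp and kubeadm-capm3-virt-management-md.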

This MR also updates the check on secret values by adding a skip_secret_value function that returns true if the secret is empty, equal to an empty dictionary, or binary.
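
A minimal sketch of such a check is shown here. Only the name skip_secret_value comes from this MR; the signature and the way "binary" is detected (bytes that cannot be decoded as UTF-8) are assumptions.

```python
def skip_secret_value(value) -> bool:
    """Return True when a secret value should not be searched for in the logs."""
    if value is None:
        return True
    if isinstance(value, bytes):
        try:
            value = value.decode("utf-8")
        except UnicodeDecodeError:
            # Assumption: anything that is not valid UTF-8 is treated as binary.
            return True
    if isinstance(value, dict):
        return len(value) == 0
    if isinstance(value, str):
        # Empty string, or an empty dictionary serialized as text.
        return value.strip() in ("", "{}")
    return False
```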

The leaks_check_report.py script lists all Kubernetes and Vault secrets of the cluster. It then builds a regex from all their values.

To check the validity of the regex, every secret is tested against it. If one of the secrets doesn't match the regex, the script fails.

This secret regex is then used to detect leaks by searching for matches in the logs of every pod of the running cluster.
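
The build, validate and scan steps described above can be sketched as follows. This is a simplified illustration: the function names are hypothetical, one regex is built per secret, and the logs are passed in as plain text instead of being fetched from the cluster.

```python
import re


def build_validated_regexes(secret_values):
    """Build one escaped regex per secret and check it against its own secret."""
    regexes = []
    for value in secret_values:
        # re.escape makes characters such as '.', '+' or '(' match literally.
        regex = re.compile(re.escape(value))
        if not regex.search(value):
            raise ValueError("regex does not match the secret it was built from")
        regexes.append(regex)
    return regexes


def find_leaks(log_text, regexes):
    """Return (line_number, line) for every log line matching a secret regex.

    Scanning line by line means multi-line secrets are not detected here.
    """
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if any(regex.search(line) for regex in regexes)
    ]
```

Escaping the value before compiling keeps secrets that contain regex metacharacters from silently matching unrelated text, or from failing to match themselves.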

There were still leaks that had not been detected. If leaks are detected, the script exits on error. The allow-failure: true setting has been removed, so a detected leak now fails the CI pipeline.

We have analyzed the detected leaks to determine whether they are sensitive. For sensitive ones, an issue has been created to track the leak, and the secret has been added to the whitelist (the list of secrets to ignore) while waiting for the fix.

Non-sensitive ones have simply been added to the whitelist, with no issue.

This MR also:

introduces some additional docstrings, as suggested in the review
simplifies the regex build and adds comments to explain how it is built
simplifies how Kubernetes encoded secrets are managed: secrets are now stored decoded in the secret list, which avoids passing an encoded parameter to every function and decoding them several times
replaces rep with decode to manage bytes secrets
adds unitary regex validation: a regex is now validated against its secret before being added to the secret_regex list
skips gz secrets: they cannot be decoded as UTF-8 to be searched for in the logs
adds an is_new_leak function to avoid duplicates: leaks are now added only once to the report
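
Two of these changes, decoding secrets once up front and de-duplicating leaks, could look like the sketch below. Only the name is_new_leak comes from this MR; decode_secret_data, the leak dictionary layout, and both signatures are assumptions.

```python
import base64


def decode_secret_data(data):
    """Decode the base64 values of a Kubernetes Secret's data field once.

    Values that are not valid UTF-8 (for example gz content) are skipped,
    since they cannot be searched for as text in the logs.
    """
    decoded = {}
    for key, b64_value in data.items():
        raw = base64.b64decode(b64_value)
        try:
            decoded[key] = raw.decode("utf-8")
        except UnicodeDecodeError:
            continue
    return decoded


def is_new_leak(leak, report):
    """Return True if an identical leak entry is not already in the report."""
    return leak not in report
```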

current known limitations:

secrets with gz content are skipped
secret leaks are for now only tracked in the logs. We should extend the tracking to helmreleases, kustomizations and all other k8s resources: https://gitlab.com/sylva-projects/sylva-core/-/issues/1756

related to issue #1526 (closed)

Edited Nov 06, 2024 by Samuel Bartel
