CI: adjust what Secrets are whitelisted by leak-report tool
This MR refactors the whitelisting code to allow partial matches on secret names. After analyzing the leak reports that failed on the CI, the following secrets are whitelisted:
sensitive ones:
- rke2-capm3-virt-token: https://gitlab.com/sylva-projects/sylva-core/-/issues/1610
- rke2-capm3-virt-management-md, rke2-capm3-virt-management-cp, kubeadm-capm3-virt-management-cp, kubeadm-capm3-virt-management-md, rke2-capm3-virt-workload, kubeadm-capm3-virt-workload passwords and ironic htpasswd: https://gitlab.com/sylva-projects/sylva-core/-/issues/1609
- csi-cephfs-secret: adminKey https://gitlab.com/sylva-projects/sylva-core/-/issues/1652
- gitea-postgres-replication: password https://gitlab.com/sylva-projects/sylva-core/-/issues/1665
non-sensitive ones:
- "clientID"
- "LOKI_USERNAME"
- "adminID"
- key from the gitea-keycloak-oidc-auth secret (it is the client ID of the Keycloak client)
- config_environment.sh from gitea: this script contains no sensitive data such as a password or key
- usage-bootstrap-signing and usage-bootstrap-authentication from bootstrap-token: these are booleans indicating whether the token may be used to sign the cluster-info ConfigMap or to authenticate
- values-rancher-webhook-104.0.2-up0.5.2.yaml from helm-operation: it contains Rancher webhook setup values, which are not sensitive
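A minimal sketch of what partial matching against the whitelist could look like. The `WHITELIST` contents, the `is_whitelisted` name and the substring rule are assumptions made for illustration; the actual matching logic in leaks_check_report.py may differ.

```python
# Hypothetical whitelist: each entry is matched as a substring, so one entry
# can cover generated names such as "rke2-capm3-virt-management-md-abc12".
WHITELIST = [
    "rke2-capm3-virt-token",
    "capm3-virt-management",
    "clientID",
    "LOKI_USERNAME",
]


def is_whitelisted(secret_name, key, whitelist=WHITELIST):
    """Return True if any whitelist entry is contained in the secret name or in the key."""
    return any(entry in secret_name or entry in key for entry in whitelist)
```

With this rule, `is_whitelisted("rke2-capm3-virt-management-md-abc12", "password")` is true because of the `capm3-virt-management` entry, while a secret whose name and key contain no whitelist entry is still reported.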
This MR also updates the check on secret values by adding a skip_secret_value function that returns true if the secret is empty, equal to an empty dictionary, or binary.
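As an illustration only, a function with this behavior might look like the sketch below. The name skip_secret_value comes from this MR; the signature and the way binary content is detected (bytes that do not decode as UTF-8) are assumptions.

```python
def skip_secret_value(value):
    """Return True when a secret value should not be searched for in the logs."""
    # Empty values and empty dictionaries carry nothing that could leak.
    if value is None or value in ("", b"", "{}") or value == {}:
        return True
    # Assumption: "binary" means bytes that cannot be decoded as UTF-8.
    if isinstance(value, bytes):
        try:
            value.decode("utf-8")
        except UnicodeDecodeError:
            return True
    return False
```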
The leaks_check_report.py script lists all Kubernetes and Vault secrets of the cluster, then builds a regex from all their values.
To check the validity of the regex, every secret is tested against it. If one of the secrets does not match the regex, the script fails.
This secret regex is then used to detect leaks by searching for matches in the logs of every pod of the running cluster.
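The flow described above can be sketched roughly as follows. This is not the script's actual code: `build_secret_regex`, `find_leaks` and the shape of `pod_logs` (a dict of pod name to log text) are hypothetical, and the sketch assumes the secret values are already decoded strings.

```python
import re


def build_secret_regex(secrets):
    """Build one alternation regex from decoded secret values, validating each pattern."""
    patterns = []
    for value in secrets:
        pattern = re.escape(value)  # treat the secret as a literal string
        # Validation step: the pattern must match the secret it was built from.
        if re.fullmatch(pattern, value) is None:
            raise ValueError("pattern does not match its own secret")
        patterns.append(pattern)
    if not patterns:
        return re.compile(r"(?!)")  # never matches when there is nothing to look for
    # Longest first, so a secret that is a prefix of another does not shadow it.
    patterns.sort(key=len, reverse=True)
    return re.compile("|".join(patterns))


def find_leaks(secret_regex, pod_logs):
    """Return (pod, line number, matched text) for every secret found in the logs."""
    leaks = []
    for pod, log in pod_logs.items():
        for lineno, line in enumerate(log.splitlines(), start=1):
            for match in secret_regex.finditer(line):
                leaks.append((pod, lineno, match.group(0)))
    return leaks
```

Escaping each value with `re.escape` keeps characters such as `.` or `$` in a secret from being interpreted as regex syntax, which is one simple way to guarantee that every secret matches its own pattern.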
There were still leaks that had not been detected. If leaks are detected, the script exits with an error. The allow-failure: true setting has been removed, so detected leaks now fail the CI pipeline.
We analyzed the detected leaks to determine whether they are sensitive. For each sensitive leak, an issue has been created to track it, and the secret has been added to the whitelist (the list of secrets to ignore) while waiting for the fix.
Non-sensitive ones have simply been added to the whitelist, with no issue.
This MR also:
- introduces some additional docstrings, as suggested in the review
- simplifies the regex build and adds comments explaining how it is built
- simplifies how Kubernetes encoded secrets are managed: secrets are now stored decoded in the secret list, which avoids passing an encoded parameter to every function and decoding them several times
- replaces rep by decode to manage bytes secrets
- adds unitary regex validation: a regex is now validated against its secret before being added to the secret_regex list
- skips gz secrets: they cannot be decoded as UTF-8 to be searched for in the logs
- adds an is_new_leak function to avoid duplicates: each leak is now added only once to the report
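Two of the points above, decoding secrets once and reporting each leak only once, could be sketched as below. Only the name is_new_leak comes from this MR; `decode_secret_data`, the function signatures and the dictionary shape of a leak are assumptions for illustration.

```python
import base64


def decode_secret_data(data):
    """Decode the base64 `data` map of a Kubernetes Secret once, up front."""
    decoded = {}
    for key, encoded in data.items():
        raw = base64.b64decode(encoded)
        try:
            decoded[key] = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Binary content (for example gz) cannot be searched as text: skip it.
            continue
    return decoded


def is_new_leak(leak, report):
    """Return True if this leak has not been recorded in the report yet."""
    return leak not in report
```

A caller would then append a leak to the report only when `is_new_leak(leak, report)` is true, so repeated log lines containing the same secret produce a single entry.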
Current known limitations:
- secrets corresponding to gz content are skipped
- secret leaks are for now only tracked in the logs. We should extend the tracking to helmreleases, kustomizations and all other k8s resources: https://gitlab.com/sylva-projects/sylva-core/-/issues/1756
related to issue #1526 (closed)