CI: adjust which secrets are whitelisted by the leak-report tool

This MR refactors the whitelisting code to allow partial matches on the following secret names, identified after analyzing the leak reports that failed on the CI:

Sensitive ones:

Non-sensitive ones:

  • "clientID"
  • "LOKI_USERNAME"
  • "adminID"
  • key from the gitea-keycloak-oidc-auth secret (it is the client ID of the Keycloak client)
  • config_environment.sh from gitea: this script contains no sensitive data such as passwords or keys
  • usage-bootstrap-signing and usage-bootstrap-authentication from bootstrap-token: these are booleans indicating whether the token may be used to sign the cluster-info ConfigMap or to authenticate
  • values-rancher-webhook-104.0.2-up0.5.2.yaml from helm-operation: it contains some Rancher webhook setup, nothing sensitive
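
As an illustration of partial matching against such names, here is a minimal sketch. The names WHITELIST_PATTERNS and is_whitelisted are hypothetical and not taken from the script; the actual data structure and matching rule in leaks_check_report.py may differ.

```python
# Hypothetical whitelist entries, copied from the non-sensitive names above.
WHITELIST_PATTERNS = [
    "clientID",
    "LOKI_USERNAME",
    "adminID",
    "usage-bootstrap-signing",
    "usage-bootstrap-authentication",
]


def is_whitelisted(secret_name, patterns=WHITELIST_PATTERNS):
    """Return True if any whitelist pattern occurs inside the secret name.

    Substring matching is an assumption about what "partial match" means here.
    """
    return any(pattern in secret_name for pattern in patterns)
```

With substring matching, a name such as "grafana/LOKI_USERNAME" is ignored without having to list every fully qualified key.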

This MR also updates the check on secret values by adding a skip_secret_value function that returns true if the secret is empty, equal to an empty dictionary, or binary.
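
A possible shape for such a function is sketched below. Only the name skip_secret_value comes from this MR; the signature and the definition of "binary" (bytes that cannot be decoded as UTF-8) are assumptions.

```python
def skip_secret_value(value):
    """Return True when a secret value should be ignored.

    Assumption: "binary" means a bytes value that is not valid UTF-8 text.
    """
    # Empty string, empty bytes, missing value, or empty dictionary.
    if value is None or value == "" or value == b"" or value == {}:
        return True
    # Bytes that cannot be decoded as UTF-8 are treated as binary.
    if isinstance(value, bytes):
        try:
            value.decode("utf-8")
        except UnicodeDecodeError:
            return True
    return False
```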

The leaks_check_report.py script lists all Kubernetes and Vault secrets of the cluster. It then builds a regex from all their values.

To check the validity of the regex, every secret is tested against it. If one of the secrets does not match the regex, the script fails.
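
The build-then-validate step can be sketched as follows. This is not the script's actual code: build_secret_regex is a hypothetical name, and joining escaped values into a single alternation is one plausible way to build the regex.

```python
import re


def build_secret_regex(secret_values):
    """Build one regex matching any secret value, then validate it.

    Raises ValueError if the list is empty or if a secret does not match
    the resulting regex.
    """
    if not secret_values:
        raise ValueError("no secret values to build a regex from")
    # Escape each value so characters such as '.' or '+' match literally.
    # Longer values come first so they win over shorter prefixes.
    ordered = sorted(secret_values, key=len, reverse=True)
    secret_regex = re.compile("|".join(re.escape(value) for value in ordered))
    # Validation: every secret must be found by the regex built from it.
    for value in secret_values:
        if secret_regex.search(value) is None:
            raise ValueError("a secret does not match the generated regex")
    return secret_regex
```

Escaping matters because secret values often contain regex metacharacters; without it, "a.b" would also match "axb".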

This secret regex is then used to detect leaks by searching for matches in the logs of every pod running in the cluster.
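
A minimal sketch of the log scan, assuming the logs have already been fetched into a dictionary mapping pod names to log text. find_leaks and the (pod, line number) result format are hypothetical; how the script retrieves logs and records leaks is not shown here.

```python
import re


def find_leaks(secret_regex, pod_logs):
    """Return (pod_name, line_number) pairs for log lines matching the regex.

    pod_logs is assumed to map each pod name to its full log text.
    """
    leaks = []
    for pod_name, log_text in pod_logs.items():
        for line_number, line in enumerate(log_text.splitlines(), start=1):
            if secret_regex.search(line):
                leaks.append((pod_name, line_number))
    return leaks
```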

There were still leaks that had not been detected. If leaks are detected, the script exits with an error. The allow_failure: true setting has been removed, so detected leaks now fail the CI pipeline.

We analyzed the detected leaks to determine whether they are sensitive. For each sensitive leak, an issue has been created to track it, and the secret has been added to the whitelist (the list of secrets to ignore) while waiting for the fix.

Non-sensitive ones have simply been added to the whitelist, with no issue.

This MR also:

  • introduces some additional docstrings, as suggested in the review.
  • simplifies the regex build and adds comments to explain how it is built.
  • simplifies how Kubernetes encoded secrets are managed. Secrets are now stored decoded in the secret list, which avoids passing an encoded parameter to every function and decoding the same values several times.
  • replaces rep with decode to handle bytes secrets.
  • adds per-secret regex validation. A regex is now validated against its secret before being added to the secret_regex list.
  • skips gz secrets. They cannot be decoded as UTF-8 to be searched for in the logs.
  • adds an is_new_leak function to avoid duplicates. Leaks are now added only once to the report.
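
Two of the points above, decoding secrets once and reporting each leak once, can be sketched as follows. Only the name is_new_leak comes from this MR; decode_secret_value, add_leak, the tuple used to represent a leak, and the list used as the report are assumptions.

```python
import base64


def decode_secret_value(encoded_value):
    """Base64-decode a Kubernetes secret value once, up front.

    Returns the UTF-8 text, or None when the payload is not valid UTF-8
    (for example gzip-compressed data), so the caller can skip it.
    """
    raw = base64.b64decode(encoded_value)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def is_new_leak(leak, report):
    """Return True if this leak has not been recorded in the report yet."""
    return leak not in report


def add_leak(leak, report):
    """Append the leak to the report only on its first occurrence."""
    if is_new_leak(leak, report):
        report.append(leak)
```

Storing decoded values means later functions work on plain strings and never need to know whether a secret came from Kubernetes (base64) or Vault.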

current known limitation:

related to issue #1526 (closed)

Edited by Samuel Bartel
