Migrate Infrastructure secrets from GCS+GKMS to GSM (Google Secret Manager)

As part of the ongoing Vault work, we decided to re-evaluate whether Vault was indeed the direction we wanted to take for simple secrets management across our infrastructure.

We ran a spike to look into Google Secret Manager, which is a very similar service to what we currently do with GCS + GKMS. After evaluating it against Vault in https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/11429 we decided to discontinue work on Vault and move our secrets over to GSM instead.

To do this, the following tasks need to be done:

  • Audit all projects for buckets that store secrets

    # List the buckets in each project so we can spot the ones holding secrets
    for i in gitlab-ci-155816 gitlab-ci-windows gitlab-org-ci-0d24e2 gitlab-ops gitlab-pre gitlab-production gitlab-release gitlab-staging-1 gitlab-testbed; do
      printf '\n%s\n===\n' "$i"
      gsutil ls -p "$i" -l
    done

  • Determine how to migrate the existing bucket hierarchy to a flat, predictable naming structure for secrets

    e.g.

    gs://gitlab-pre-secrets/gitlab-omnibus-secrets/pre.enc
    =>
    projects/gitlab-pre/secrets/chef_gitlab-omnibus-secrets_pre

    The only example I can find where more than one file is in a storage bucket is Consul:

    gs://gitlab-pre-secrets/gitlab-consul/pre-client.enc
    =>
    projects/gitlab-pre/secrets/chef_gitlab-consul_pre-client

    We will prefix the Chef secret names with chef_ so that we can easily find them later.
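
    The mapping in the two examples above can be sketched as a small shell helper. This is only an illustration: the function names are made up here, it assumes exactly one directory level between the bucket and the .enc file (as in both examples), and the project is passed explicitly because the bucket-to-project mapping is not spelled out in this issue.

```shell
# Hypothetical helper: gs://BUCKET/DIR/FILE.enc -> chef_DIR_FILE
gcs_to_secret_id() {
  path="${1#gs://}"        # drop the scheme: BUCKET/DIR/FILE.enc
  path="${path#*/}"        # drop the bucket: DIR/FILE.enc
  dir="${path%%/*}"        # DIR
  file="${path##*/}"       # FILE.enc
  printf 'chef_%s_%s\n' "$dir" "${file%.enc}"
}

# Hypothetical helper: full GSM resource name for a project and GCS path
# Usage: gcs_to_secret_resource PROJECT GCS_URL
gcs_to_secret_resource() {
  printf 'projects/%s/secrets/%s\n' "$1" "$(gcs_to_secret_id "$2")"
}

# Example call:
#   gcs_to_secret_resource gitlab-pre gs://gitlab-pre-secrets/gitlab-consul/pre-client.enc
```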

  • Roll out new terraform configuration for GSM secrets in place of buckets.

    Make a new Terraform module under gitlab-com-infrastructure/modules that takes a map of secrets and the service accounts that have access to them, e.g.

    secrets = {
      "chef_gitlab-omnibus-secrets_gprd" = {
        readers = [
          "terraform-ci@whatever.com",
          "someoneelse@who.com"
        ]
        project = "gitlab-production"
      }
    }
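
    To make the intent of one map entry concrete, the gcloud calls below show roughly what the module would need to manage per secret: the secret itself plus one read binding per service account. This is a sketch, not the module; the function name is hypothetical, and automatic replication and the secretAccessor role are assumptions. In Terraform the counterparts would likely be the google_secret_manager_secret and google_secret_manager_secret_iam_member resources.

```shell
# Hypothetical helper: create one secret and grant read access to its readers.
# Usage: create_secret PROJECT SECRET_ID READER_EMAIL...
create_secret() {
  project="$1"; secret="$2"; shift 2
  # Assumption: automatic replication is acceptable for these secrets
  gcloud secrets create "$secret" --project "$project" \
    --replication-policy automatic
  # One IAM binding per service account listed under "readers"
  for reader in "$@"; do
    gcloud secrets add-iam-policy-binding "$secret" --project "$project" \
      --member "serviceAccount:${reader}" \
      --role roles/secretmanager.secretAccessor
  done
}

# Example call, using the placeholder values from the map above:
#   create_secret gitlab-production chef_gitlab-omnibus-secrets_gprd \
#     terraform-ci@whatever.com someoneelse@who.com
```
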
  • Make sure the secretmanager.googleapis.com API is enabled in every project that needs it
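
    If we do not manage this through Terraform, a loop over the projects is enough. A minimal sketch, assuming the caller has permission to enable services; the function name is hypothetical. Enabling an API that is already enabled is a no-op, so it is safe to re-run.

```shell
# Hypothetical helper: enable the Secret Manager API in each given project.
# Usage: enable_secret_manager PROJECT...
enable_secret_manager() {
  for project in "$@"; do
    gcloud services enable secretmanager.googleapis.com --project "$project"
  done
}

# Example call:
#   enable_secret_manager gitlab-pre gitlab-staging-1 gitlab-production
```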

  • Upgrade gcloud on all nodes to a version that supports GSM

    Any of our VM infrastructure not running Ubuntu 18.04 or later will need to have the gcloud utilities upgraded (likely from the apt repo Google provides).

  • Modify https://ops.gitlab.net/gitlab-cookbooks/gitlab_secrets/-/blob/master/libraries/secrets.rb to create a GitlabGSM class that can pull from Google Secret Manager

  • Write automation to be able to sync secrets between GCS+GKMS and GSM during the migration phase

    We will likely roll out GSM across environments in stages, so tooling that keeps both stores in sync during this period will be helpful.
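
    One direction of that sync (GCS+GKMS to GSM) can be sketched as a single pipeline per secret. The function name, the keyring and key arguments, and the global KMS location are assumptions; the secret is assumed to exist already. Note that every run adds a new secret version even when the content is unchanged, and that without pipefail a failed download only surfaces as a decrypt error.

```shell
# Hypothetical helper: copy one GKMS-encrypted GCS object into a new GSM version.
# Usage: sync_secret PROJECT GCS_URL KEYRING KEY SECRET_ID
# KEYRING and KEY are placeholders for whatever the object was encrypted with.
sync_secret() {
  project="$1"; gcs_url="$2"; keyring="$3"; key="$4"; secret="$5"
  # Download ciphertext, decrypt it in memory, upload plaintext as a new version
  gsutil cat "$gcs_url" \
    | gcloud kms decrypt --project "$project" --location global \
        --keyring "$keyring" --key "$key" \
        --ciphertext-file - --plaintext-file - \
    | gcloud secrets versions add "$secret" --project "$project" --data-file -
}
```

    The plaintext never touches disk here, which is the main reason for piping instead of using temporary files.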

  • Go through environments and migrate their chef node attributes so they pull secrets from GSM

    We will do a change request to roll this out into gprd when ready

  • Modify https://ops.gitlab.net/gitlab-cookbooks/chef-repo/-/tree/master/bin gkms scripts to pull from GSM

    This keeps backwards compatibility for the users editing secrets.
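
    As a rough idea of what an edit flow backed by GSM could look like (this is not the existing gkms script, and the function name is hypothetical): fetch the latest version into a private temp file, open it in the editor, and upload the result as a new version.

```shell
# Hypothetical helper: edit the latest version of a secret in $EDITOR.
# Usage: gsm_edit PROJECT SECRET_ID
gsm_edit() {
  project="$1"; secret="$2"
  tmp="$(mktemp)" || return 1   # mktemp creates the file readable by the owner only
  gcloud secrets versions access latest --secret "$secret" --project "$project" > "$tmp" \
    && "${EDITOR:-vi}" "$tmp" \
    && gcloud secrets versions add "$secret" --project "$project" --data-file "$tmp"
  status=$?
  rm -f "$tmp"                  # always remove the plaintext copy
  return "$status"
}
```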

  • Modify runbooks to include documentation about GSM

  • Modify k8s-workloads/gitlab-com to pull from GSM

    helmfile supports the following reference format, which we can use directly in the file: ref+gcpsecrets://PROJECT/SECRET[?version=VERSION]#/yaml_or_json_key/in/secret

  • Modify k8s-workloads/gitlab-helmfiles to pull from GSM

    helmfile supports the following reference format, which we can use directly in the file: ref+gcpsecrets://PROJECT/SECRET[?version=VERSION]#/yaml_or_json_key/in/secret
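
    For both helmfile repositories, the reference format can be illustrated with a small helper that assembles the string from its parts. The function name and the example project, secret and key names are hypothetical; only the format itself comes from the text above.

```shell
# Hypothetical helper: build a gcpsecrets reference for helmfile.
# Usage: gcpsecrets_ref PROJECT SECRET KEY_PATH [VERSION]
gcpsecrets_ref() {
  ref="ref+gcpsecrets://$1/$2"
  # The version query is optional; without it the format carries no version
  if [ -n "${4:-}" ]; then
    ref="${ref}?version=$4"
  fi
  printf '%s#/%s\n' "$ref" "$3"
}

# Example call:
#   gcpsecrets_ref gitlab-pre some-k8s-secret registry/password 3
```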

  • Confirm with SIRT that they are able to read new audit events for GSM

    And work with them on any other changes that need to be done
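
    A quick way to check what is being recorded is to query Cloud Logging for entries from the Secret Manager service. This is a sketch with a hypothetical function name. Secret reads are Data Access events, which Cloud Audit Logs does not record unless Data Access logging is enabled for the service, so an empty result for reads may just mean that logging is off.

```shell
# Hypothetical helper: show recent audit log entries for Secret Manager.
# Usage: gsm_audit_events PROJECT [LIMIT]
gsm_audit_events() {
  gcloud logging read \
    'protoPayload.serviceName="secretmanager.googleapis.com"' \
    --project "$1" --limit "${2:-20}" --format json
}
```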

  • Evaluate the GSM=>GKE integration shipping in the GKE versions rolling out the week of 2020-10-19

    See whether this is a feasible path for us to get secrets into Kubernetes in the future. The upstream codebase is at https://github.com/GoogleCloudPlatform/secrets-store-csi-driver-provider-gcp

  • Decommission everything Vault related

Edited by Graeme Gillies