GCS cache in Kubernetes executor: support for Workload Identity
Description
Support Workload Identity based authorization to GCS from the Kubernetes executor.
Proposal
Support Workload Identity based authorization to GCS from the Kubernetes executor, eliminating the need to supply external credentials to the runner. This would make the GCS cache more secure and easier to configure. Users would add Workload Identity annotations to their GitLab Runner deployment to link the Kubernetes service account to the Google IAM service account that has permission to access the storage bucket.
Additional benefits are:
- This form of authorization for runners also works for other Google resources, e.g. KMS key access for Binary Authorization.
- It keeps authorization out of GitLab: credentials are easier to track, and GitLab carries less liability for managing them.
When I tried not supplying credentials, I encountered two errors. Firstly, it complained about the s3access secret, as per gitlab-org/charts/gitlab-runner!177
After removing this, the job log showed: No URL provided, cache will be not uploaded to shared cache server. Cache will be stored only locally. ++ echo 'Created cache'
Looking into the container logs, I saw: ERROR: error while resolving GCS credentials: GCS config present, but credentials are not configured
https://gitlab.com/gitlab-org/gitlab-runner/blob/master/cache/gcs/credentials_resolver.go#L67
It looks like this could be amended to make supplying credentials optional. It may be helpful to flag this with an additional config key in the values.yaml. This would signal to the code not to look for credentials, and instead to look for the required Workload Identity annotation on the service account that the executor is running as. That annotation is set with:
kubectl annotate serviceaccount --namespace some-namespace default iam.gke.io/gcp-service-account=some-account@some-gcp-project.iam.gserviceaccount.com
https://gitlab.com/gitlab-org/charts/gitlab-runner/blob/master/values.yaml#L211
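To make the "optional credentials" idea concrete, here is a minimal sketch in Go of how a resolver could choose a credential source. Everything in it is hypothetical and for illustration only: the `cacheGCSConfig` type, the `UseWorkloadIdentity` key, and the precedence order are assumptions, not the runner's actual code or config schema.

```go
package main

import (
	"errors"
	"fmt"
)

// cacheGCSConfig is a hypothetical stand-in for the runner's GCS cache settings.
type cacheGCSConfig struct {
	CredentialsFile string
	AccessID        string
	PrivateKey      string
	// UseWorkloadIdentity is a hypothetical opt-in key (not an existing setting).
	UseWorkloadIdentity bool
}

type credentialSource int

const (
	sourceNone credentialSource = iota
	sourceCredentialsFile
	sourceStaticKey
	sourceWorkloadIdentity
)

// errNoCredentials mirrors the message seen in the container logs.
var errNoCredentials = errors.New("GCS config present, but credentials are not configured")

// resolveSource picks explicit credentials first and only falls back to
// Workload Identity when the opt-in flag is set (assumed precedence).
func resolveSource(c cacheGCSConfig) (credentialSource, error) {
	switch {
	case c.CredentialsFile != "":
		return sourceCredentialsFile, nil
	case c.AccessID != "" && c.PrivateKey != "":
		return sourceStaticKey, nil
	case c.UseWorkloadIdentity:
		return sourceWorkloadIdentity, nil
	default:
		return sourceNone, errNoCredentials
	}
}

func main() {
	src, err := resolveSource(cacheGCSConfig{UseWorkloadIdentity: true})
	fmt.Println(src == sourceWorkloadIdentity, err)
}
```

With an explicit flag, a configuration that simply forgot its credentials still fails loudly instead of silently switching to the ambient identity.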
I've pulled together a PoC that demonstrates signing GCS URLs using a GCE / GKE metadata server provided token: https://gist.github.com/pdecat/80f21e36583420abbfdeae0494a53501
Note: for this to work, the IAM service account must be granted roles/iam.serviceAccountTokenCreator on itself, in addition to IAM permissions on the target GCS bucket.
If this is accepted, I'm planning to submit an MR implementing this in gcsAdapter.presignURL() (https://gitlab.com/gitlab-org/gitlab-runner/blob/v12.7.1/cache/gcs/adapter.go#L54) as a fallback if none of CredentialsFile and AccessID+PrivateKey is provided.
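For readers unfamiliar with the mechanism, the fallback could look roughly like the sketch below. It is not the code from the gist or the planned MR: the helper names (`metadataGet`, `signBlobURL`, `signBlobBody`, `signBytes`) are made up for illustration. It reads the service account email and an access token from the metadata server, then asks the IAM Credentials `signBlob` endpoint to sign a payload, which is the step that needs roles/iam.serviceAccountTokenCreator.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
)

const (
	// Metadata server path for the default service account on GCE / GKE.
	metadataBase = "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default"
	iamBase      = "https://iamcredentials.googleapis.com/v1"
)

// metadataGet reads one value (e.g. "/email" or "/token") from the metadata server.
func metadataGet(client *http.Client, path string) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, metadataBase+path, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Metadata-Flavor", "Google")
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("metadata server returned %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// signBlobURL builds the IAM Credentials signBlob endpoint for a service account.
func signBlobURL(email string) string {
	return iamBase + "/projects/-/serviceAccounts/" + email + ":signBlob"
}

// signBlobBody encodes the payload as the JSON request body signBlob expects.
func signBlobBody(payload []byte) ([]byte, error) {
	return json.Marshal(map[string]string{"payload": base64.StdEncoding.EncodeToString(payload)})
}

// signBytes asks IAM to sign payload as the given service account.
func signBytes(client *http.Client, email, token string, payload []byte) ([]byte, error) {
	body, err := signBlobBody(payload)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, signBlobURL(email), bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("signBlob returned %s", resp.Status)
	}
	var out struct {
		SignedBlob string `json:"signedBlob"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return base64.StdEncoding.DecodeString(out.SignedBlob)
}

func main() {
	client := http.DefaultClient
	email, err := metadataGet(client, "/email")
	if err != nil {
		fmt.Println("not running on GCE/GKE?", err)
		return
	}
	raw, err := metadataGet(client, "/token")
	if err != nil {
		fmt.Println("token error:", err)
		return
	}
	var tok struct {
		AccessToken string `json:"access_token"`
	}
	if err := json.Unmarshal(raw, &tok); err != nil {
		fmt.Println("token decode error:", err)
		return
	}
	sig, err := signBytes(client, strings.TrimSpace(string(email)), tok.AccessToken, []byte("string-to-sign"))
	fmt.Println(len(sig), err)
}
```

In an adapter built on cloud.google.com/go/storage, a function like `signBytes` could presumably be passed as the `SignBytes` field of `storage.SignedURLOptions`, with the email as `GoogleAccessID`, so that no private key ever needs to be configured.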
An alternative would have been to use the GCS API instead of signed URLs to interact with GCS buckets but that would be way more invasive given the current cache adapter design.