Support for Labkit Continuous Profiling
What
Create IAM policies (and potentially GCP Service Accounts) to allow services to access Google Cloud Profiler.
Why
Provide the development team with operational insights into the production service. In a recent DX survey, the statement "When I investigate production issues, it's easy for me to debug" scored remarkably low. Runway can help improve the developer experience by making continuous profiling easy.
For example, PVS uses labkit/monitoring, which supports Continuous Profiling.
How
Implementation depends on the runtime.
Note
I suggest we implement this for Cloud Run and GKE first, because their implementations are much simpler. We should implement EKS only if an existing user requests it, because of the implementation complexity.
Cloud Run
- Provisioner: Grant the roles/cloudprofiler.agent role to the Cloud Run service account crun-${runway_service_id}.
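
The Cloud Run step amounts to a single IAM binding. The Python sketch below shows the shape of that binding. It assumes the service account lives in the same GCP project as the service and uses the standard NAME@PROJECT_ID.iam.gserviceaccount.com email format. The function name and the "example-service" ID are hypothetical and not part of the provisioner.

```python
import json

PROFILER_ROLE = "roles/cloudprofiler.agent"


def cloud_run_profiler_binding(runway_service_id: str, gcp_project_id: str) -> dict:
    """Build the IAM binding that lets the Cloud Run service account write profiles.

    Assumption: the service account is named crun-<runway_service_id> and lives
    in gcp_project_id.
    """
    email = f"crun-{runway_service_id}@{gcp_project_id}.iam.gserviceaccount.com"
    return {"role": PROFILER_ROLE, "members": [f"serviceAccount:{email}"]}


if __name__ == "__main__":
    # "example-service" is a hypothetical Runway service ID.
    binding = cloud_run_profiler_binding("example-service", "gitlab-runway-production")
    print(json.dumps(binding, indent=2))
```

In practice the provisioner would express this binding in its own infrastructure code; the dictionary only illustrates which role is attached to which member.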
GKE
- Provisioner: Create a GCP Service Account for each Runway service.
- Provisioner: Create an IAM policy granting the roles/cloudprofiler.agent role to the service's GCP Service Account.
- Provisioner: Grant the Kubernetes Service Account permission to act as the GCP Service Account using Workload Identity Federation.
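
The three GKE steps can be sketched as data. The Python sketch below assumes the service-account-impersonation style of Workload Identity Federation for GKE, where the Kubernetes Service Account is referenced as PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME] and annotated with iam.gke.io/gcp-service-account. It also assumes that the namespace and the Kubernetes Service Account are both named after the Runway service ID, as in the EKS trust policy in this proposal. The function name and the GSA email in the example are hypothetical.

```python
def gke_profiler_bindings(runway_service_id: str, gcp_project_id: str, gsa_email: str) -> dict:
    """Describe the IAM bindings and the KSA annotation for one Runway service on GKE.

    Assumptions: namespace == KSA name == runway_service_id, and the cluster's
    workload pool is <gcp_project_id>.svc.id.goog.
    """
    ksa_member = (
        f"serviceAccount:{gcp_project_id}.svc.id.goog"
        f"[{runway_service_id}/{runway_service_id}]"
    )
    return {
        # Project-level binding: the GSA may write profiles.
        "project_bindings": [
            {"role": "roles/cloudprofiler.agent", "members": [f"serviceAccount:{gsa_email}"]}
        ],
        # Binding on the GSA itself: the KSA may act as the GSA.
        "gsa_bindings": [
            {"role": "roles/iam.workloadIdentityUser", "members": [ksa_member]}
        ],
        # Annotation on the Kubernetes Service Account pointing at the GSA.
        "ksa_annotations": {"iam.gke.io/gcp-service-account": gsa_email},
    }


if __name__ == "__main__":
    # Hypothetical service ID and GSA name, for illustration only.
    print(gke_profiler_bindings(
        "example-service",
        "gitlab-runway-production",
        "example-service@gitlab-runway-production.iam.gserviceaccount.com",
    ))
```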
EKS
- Provisioner: Provision a "Workload Identity Pool" (one each for gitlab-runway-staging and gitlab-runway-production).
- Provisioner: Provision the "AWS Identity Provider" within the Pool.
- Provisioner: Add the GCP Project ID (e.g. "gitlab-runway-production"), the GCP Project Number (e.g. 371965853757) and the Workload Identity Pool ID to the flux-cluster-vars ConfigMap.
- Provisioner: Create a GCP Service Account for each Runway service as eks-${runway_service_id}.
- Provisioner: Bind the GSA to the specific AWS identity using the roles/iam.workloadIdentityUser role. Format:
  principal://iam.googleapis.com/projects/${gcp_project_number}/locations/global/workloadIdentityPools/${gcp_workload_identity_pool}/subject/arn:aws:iam::${aws_account_id}:role/EKS-GCP-Role-${runway_service_id}
- Provisioner: Create an IAM policy granting the roles/cloudprofiler.agent role to the service's GCP Service Account.
- Provisioner: Add a per-service AWS IAM Role (e.g. EKS-GCP-Role-${runway_service_id}):
  # eks_trust_policy.json or inline JSON
  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Federated": "arn:aws:iam::${aws_account_id}:oidc-provider/${var.eks_oidc_issuer_url}"
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
          "StringEquals": {
            # Restricts assumption to the specific K8s Service Account in its namespace
            "${var.eks_oidc_issuer_url}:sub": "system:serviceaccount:${var.runway_service_id}:${var.runway_service_id}"
          }
        }
      }
    ]
  }
- runway-eks Helm chart: Add a ConfigMap with the following content:
  {
    "type": "external_account",
    "audience": "//iam.googleapis.com/projects/${gcp_project_number}/locations/global/workloadIdentityPools/${gcp_wip_id}/providers/EKS-Provider",
    "service_account_impersonation_url": "https://iamcredentials.googleapis.com/v1/projects/${gcp_project_id}/serviceAccounts/eks-${runway_service_id}@${gcp_project_id}.iam.gserviceaccount.com:generateAccessToken",
    "credential_source": {
      "environment_id": "aws1",
      "region_url": "http://169.254.169.254/latest/meta-data/placement/availability-zone",
      "url": "http://169.254.169.254/latest/meta-data/iam/security-credentials"
    }
  }
  Note: ${runway_service_id} must be substituted by Helm. The other placeholders will be substituted by Flux from the flux-cluster-vars ConfigMap.
- runway-eks Helm chart: Annotate the ServiceAccount resource:
  eks.amazonaws.com/role-arn: "arn:aws:iam::${aws_account_id}:role/EKS-GCP-Role-${runway_service_id}"
  Note: ${runway_service_id} must be substituted by Helm. The other placeholders will be substituted by Flux from the flux-cluster-vars ConfigMap.
- runway-eks Helm chart: Mount the ConfigMap into the Deployment as a volume, so that the credentials file is available as /etc/gcp-service-account/credentials.json.
- runway-eks Helm chart: Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to /etc/gcp-service-account/credentials.json.
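
The Python sketch below shows how the per-service EKS values fit together: the principal string, the AWS trust policy and the credentials file are all rendered from the same handful of inputs. It reproduces only the formats and fields given in the steps above and makes no claim that the credentials file is complete. The function names, the AWS account ID 123456789012, the pool ID "runway-eks" and the OIDC issuer value are hypothetical placeholders.

```python
import json


def wif_principal(project_number: str, pool_id: str, aws_account_id: str, service_id: str) -> str:
    """Render the IAM member that is granted roles/iam.workloadIdentityUser on the GSA."""
    subject = f"arn:aws:iam::{aws_account_id}:role/EKS-GCP-Role-{service_id}"
    return (
        f"principal://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}/subject/{subject}"
    )


def eks_trust_policy(aws_account_id: str, oidc_issuer: str, service_id: str) -> dict:
    """Render the trust policy for the per-service AWS IAM Role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Federated": f"arn:aws:iam::{aws_account_id}:oidc-provider/{oidc_issuer}"},
            "Action": "sts:AssumeRoleWithWebIdentity",
            # Only the service's own K8s Service Account, in its own namespace, may assume the role.
            "Condition": {"StringEquals": {
                f"{oidc_issuer}:sub": f"system:serviceaccount:{service_id}:{service_id}"
            }},
        }],
    }


def gcp_credentials_config(project_number: str, project_id: str, pool_id: str, service_id: str) -> dict:
    """Render the credentials.json content, limited to the fields listed in this proposal."""
    sa_email = f"eks-{service_id}@{project_id}.iam.gserviceaccount.com"
    return {
        "type": "external_account",
        "audience": (
            f"//iam.googleapis.com/projects/{project_number}/locations/global"
            f"/workloadIdentityPools/{pool_id}/providers/EKS-Provider"
        ),
        "service_account_impersonation_url": (
            f"https://iamcredentials.googleapis.com/v1/projects/{project_id}"
            f"/serviceAccounts/{sa_email}:generateAccessToken"
        ),
        "credential_source": {
            "environment_id": "aws1",
            "region_url": "http://169.254.169.254/latest/meta-data/placement/availability-zone",
            "url": "http://169.254.169.254/latest/meta-data/iam/security-credentials",
        },
    }


if __name__ == "__main__":
    # All argument values except the project ID and number are hypothetical.
    print(wif_principal("371965853757", "runway-eks", "123456789012", "example-service"))
    print(json.dumps(eks_trust_policy(
        "123456789012", "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE", "example-service"), indent=2))
    print(json.dumps(gcp_credentials_config(
        "371965853757", "gitlab-runway-production", "runway-eks", "example-service"), indent=2))
```

Rendering all three artifacts from one set of inputs makes it harder for the role name in the principal, the trust policy and the ServiceAccount annotation to drift apart.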