Fix General Issues with Current Example State
After reading through all of the issues in this repository, updating IAM permissions after repeated failed runs, and cleaning up AWS manually because the cleanup jobs fail, we finally got a working cluster, I think (I'm not yet sure how to use it in CI/CD).
The documentation should probably come with a warning until GitLab can address the issues.
Environment:
- GitLab Runner 16.1.0 running on an EC2 instance, under Docker Compose, with the IAM policy shown below attached via an EC2 IAM Role.
- Environment variables applied via CI/CD Settings in the imported project:
  - TF_VAR_agent_token
  - TF_VAR_kas_address
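For context on how those two CI/CD variables reach Terraform: any environment variable named TF_VAR_<name> is read as the value of the input variable <name>. A minimal sketch of the matching declarations follows. The descriptions and the sensitive flag are my assumptions, and the example project may already declare these differently.

```hcl
# variables.tf (sketch; check the example project's own declarations first)

# Populated from the CI/CD variable TF_VAR_agent_token.
variable "agent_token" {
  type        = string
  description = "Token for the GitLab Agent (description is an assumption)"
  sensitive   = true # assumption: keeps the value out of plan output
}

# Populated from the CI/CD variable TF_VAR_kas_address.
variable "kas_address" {
  type        = string
  description = "Address of the GitLab agent server (description is an assumption)"
}
```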
Issues Noted:
- The Terraform providers locked are incompatible with the validate job as noted in #14.
  - We ran terraform init -upgrade:
    - This upgraded the following providers:
      - registry.terraform.io/hashicorp/aws to 5.8.0
      - registry.terraform.io/hashicorp/cloudinit to 2.3.2
      - registry.terraform.io/hashicorp/helm to 2.10.1
      - registry.terraform.io/hashicorp/kubernetes to 2.22.0
      - registry.terraform.io/hashicorp/tls to 4.0.4
    - This automatically added provider "registry.terraform.io/hashicorp/time" (version 0.9.1) as a dependency
  - We additionally upgraded terraform-aws-modules/eks/aws to 19.15.3 to fix resulting syntax errors.
- With the above upgrades, data.tf needed to use module.eks.cluster_name for reaching the cluster.
- We had to manually create a VPC peering connection from my existing VPC, the necessary routing table entries between the two VPCs, and a security group entry in the gitlab-terraform-eks-cluster SG so that the existing GitLab Runner could finish installing the Helm chart for the GitLab Agent.
  - The Runner was resolving an internal address for the EKS endpoint. Maybe this would be different if I weren't trying to create the cluster in the same AWS account?
  - I cannot imagine how you would otherwise reach a new cluster created in a new VPC from a preexisting GitLab Runner, which you need to have in order to begin this work.
- My user with full AdministratorAccess (AWS managed policy) permissions in IAM received the following warning in the EKS console: "Your current IAM principal doesn’t have access to Kubernetes objects on this cluster. This might be due to the current IAM principal not having an access entry with permissions to access the cluster." The documentation does not recommend how to extend the eks.tf file with a User/Role ARN to allow access. This will end up causing some level of support burden for operators.
- My user with full AdministratorAccess (AWS managed policy) lacks access to KMS to delete secrets.
  - I do not believe this is in scope for this project. It is noted here only because it came up while cleaning up after the repeated failed attempts to get this stack working.
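The data.tf and eks.tf points above could look roughly like the fragment below. This is a sketch under assumptions, not the example project's actual code: the data source labels, the placeholder account ID, and the example-operator user are hypothetical. The aws-auth inputs are the ones the 19.x series of terraform-aws-modules/eks/aws exposes, and managing that ConfigMap requires the kubernetes provider to be able to reach the cluster.

```hcl
# data.tf: look the cluster up via the module's cluster_name output.
# The "this" labels are hypothetical; keep whatever labels the example uses.
data "aws_eks_cluster" "this" {
  name = module.eks.cluster_name
}

data "aws_eks_cluster_auth" "this" {
  name = module.eks.cluster_name
}

# eks.tf: arguments added inside the existing module block (fragment).
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "19.15.3"

  # ... existing arguments from the example stay as they are ...

  # Let the module manage the aws-auth ConfigMap.
  manage_aws_auth_configmap = true

  # Map an IAM user to a Kubernetes group so the EKS console can list objects.
  aws_auth_users = [
    {
      userarn  = "arn:aws:iam::111122223333:user/example-operator" # placeholder ARN
      username = "example-operator"                                # hypothetical name
      groups   = ["system:masters"]
    },
  ]
}
```

Granting system:masters is the bluntest option; a narrower Kubernetes group would be preferable once the access requirements are known.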
IAM Policy
This is the IAM policy, originally based on the documentation, that we ended up with when we finally reached the point where our knowledge couldn't take us any further:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ec2:*",
        "eks:*",
        "elasticloadbalancing:*",
        "autoscaling:*",
        "cloudwatch:*",
        "logs:*",
        "kms:CreateAlias",
        "kms:CreateKey",
        "kms:DescribeKey",
        "kms:ListAliases",
        "kms:DeleteAlias",
        "iam:AddRoleToInstanceProfile",
        "iam:AttachRolePolicy",
        "iam:CreateInstanceProfile",
        "iam:CreateOpenIDConnectProvider",
        "iam:CreatePolicy",
        "iam:CreateRole",
        "iam:CreateServiceLinkedRole",
        "iam:GetOpenIDConnectProvider",
        "iam:GetPolicy",
        "iam:GetPolicyVersion",
        "iam:GetRole",
        "iam:GetRolePolicy",
        "iam:ListAttachedRolePolicies",
        "iam:ListInstanceProfilesForRole",
        "iam:ListPolicyVersions",
        "iam:ListRolePolicies",
        "iam:ListRoles",
        "iam:PassRole",
        "iam:PutRolePolicy",
        "iam:TagOpenIDConnectProvider",
        "iam:DetachRolePolicy",
        "iam:DeleteOpenIDConnectProvider",
        "iam:DeletePolicy",
        "iam:DeleteRole",
        "iam:DeleteRolePolicy"
      ],
      "Resource": "*"
    }
  ]
}
- We were left with two warnings which need addressing from GitLab:
  - "PassRole With Star In Resource: Using the iam:PassRole action with wildcards (*) in the resource can be overly permissive because it allows iam:PassRole permissions on multiple resources. We recommend that you specify resource ARNs or add the iam:PassedToService condition key to your statement."
  - "Create SLR With Star In Resource: Using the iam:CreateServiceLinkedRole action with wildcards (*) in the resource can allow creation of unintended service-linked roles. We recommend that you specify resource ARNs instead."
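One way the two warnings might be addressed is to move iam:PassRole and iam:CreateServiceLinkedRole out of the wildcard statement into their own scoped statements. The sketch below uses Terraform's aws_iam_policy_document data source. The condition keys are real IAM keys, but the service names listed are my assumptions about what this stack needs and have not been tested against it.

```hcl
# Sketch only: the service lists are assumptions, not verified against this stack.
data "aws_iam_policy_document" "scoped_iam_actions" {
  # Allow passing roles only to the services named below.
  statement {
    sid       = "PassRoleToKnownServices"
    effect    = "Allow"
    actions   = ["iam:PassRole"]
    resources = ["*"]

    condition {
      test     = "StringEquals"
      variable = "iam:PassedToService"
      values   = ["eks.amazonaws.com", "ec2.amazonaws.com"] # assumed services
    }
  }

  # Allow creating service-linked roles only for the services named below.
  statement {
    sid       = "CreateSlrForKnownServices"
    effect    = "Allow"
    actions   = ["iam:CreateServiceLinkedRole"]
    resources = ["arn:aws:iam::*:role/aws-service-role/*"]

    condition {
      test     = "StringEquals"
      variable = "iam:AWSServiceName"
      values = [
        "eks.amazonaws.com",           # assumed
        "eks-nodegroup.amazonaws.com", # assumed
        "autoscaling.amazonaws.com",   # assumed
      ]
    }
  }
}
```

The rendered JSON is available as data.aws_iam_policy_document.scoped_iam_actions.json, so the two statements can be pasted into the policy above in place of the corresponding wildcard entries.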