This issue was created to continuously collect feedback around the GitLab Kubernetes Agent and any of its (missing) features.
The GitLab Agent for Kubernetes (GA4K, or simply the Agent) is the recommended way to connect GitLab to a Kubernetes cluster for deployments, security, and (soon) observability.
The Agent was initially released in September 2019 for Self-Managed GitLab users and has been available on GitLab.com since February 2020.
I am not actively using the Agent yet, but I am currently trying it out. In my opinion, it really improves the security of the Kubernetes integration compared to the certificate-based approach.
However, there are a few use cases I am not sure how to handle properly:
Using ci_access, it's possible to authorize projects to use the agent, but is there also a way to restrict the KUBECONFIG variable to only be exposed in certain environments, similar to how it works with normal CI variables? Maybe I've just not found the configuration for this.
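For reference, this is roughly what the ci_access authorization looks like in the agent's config.yaml (the project path below is just a placeholder):

```yaml
# .gitlab/agents/<agent-name>/config.yaml
ci_access:
  projects:
    - id: group/deploy-project  # placeholder path of the project to authorize
```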
Many applications are packaged as Helm charts. How will it be possible to use GitOps with Helm charts? ArgoCD and Flux seem to define their own Helm release type for this (see here and here), but I have not found something like this for the GitLab Agent.
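For comparison, a Flux HelmRelease looks roughly like this (chart name, version, and source are placeholders taken from Flux's own examples):

```yaml
# Flux v2 HelmRelease: declaratively installs/upgrades a chart from a
# HelmRepository source that Flux polls on the given interval.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: podinfo
  namespace: default
spec:
  interval: 5m
  chart:
    spec:
      chart: podinfo
      version: ">=6.0.0"
      sourceRef:
        kind: HelmRepository
        name: podinfo
        namespace: flux-system
```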
Installing the GitLab Agent for Kubernetes in a normal project worked great without any issues; moving this project to group scope broke the deployment. I struggled with this message:
This job failed because the necessary resources were not successfully created. To find the cause of this error when creating a namespace and service account, check the logs. Reasons for failure include:
- The token you gave GitLab does not have cluster-admin privileges required by GitLab.
- Missing KUBECONFIG or KUBE_TOKEN deployment variables. To be passed to your job, they must have a matching environment:name. If your job has no environment:name set, the Kubernetes credentials are not passed to it.
There is a link to a troubleshooting section (https://docs.gitlab.com/ee/user/project/clusters/deploy_to_cluster.html#troubleshooting), but I'm lost there; I cannot figure out what the problem is. It obviously has something to do with an automatically created environment and some permission on the cluster, but the configuration setup did not change compared to the project in user scope.
I even created a completely new management project and a new GitLab Agent in the cluster, without success. So please extend the documentation on what's going on there. Feel free to have a look at my project be1 generic cluster in the be1group group.
I used the certificate-based Kubernetes integration several years ago, and I have now looked at the Kubernetes Agent service. A great way to adopt a GitOps process!
The "monitor" part seems (and my limited knowledge of Kubernetes is perhaps the reason for this) to be more difficult with the Agent. I created a new project from the cluster management template (so helpful!), but I seem to need to create the link between services, like Prometheus, to have monitoring in GitLab, whereas before this integration was native.
I really like the GitLab Kubernetes Agent. The most useful feature is the CI/CD tunnel. These things would be welcome:
An official Terraform module to handle activities like registration and setup (I found one, but it's still not official; moreover, I faced an issue while destroying it).
The current limitation that the CI/CD tunnel can be exposed only within the same project tree forces a setup that does not necessarily align with the platform (K8s) design.
Helm and Kustomize support would be a huge advantage.
There is a lack of GitLab features that leverage the Agent; there are, for example, no recommendations on how to deploy applications using either the CI/CD tunnel or GitOps.
There is documentation about installing the GitLab agent into a Kubernetes cluster. However, documentation on uninstalling the agent is missing. Such documentation should include removing the pod, node, and namespace using kubectl, or preferably using a subcommand in the cluster-integration/gitlab-agent/cli Docker image.
Hi, after playing around and migrating my existing production (certificate-based) cluster to the GitLab agent, I can say it's great, guys! The only thing (but that's quite understandable while it's in development) is that the documentation doesn't have the usual depth.
What's the way to grant the GitLab agent in project "a" access to a Helm chart in the private package registry of project "b", all in the same group?
Sample:
I'm running different projects in a GitLab group.
Every project uses CI/CD and generates Docker images stored in the container registry and Helm charts stored in the GitLab Helm package registry.
Now I want to install one of the self-created Helm packages (out of the private GitLab registry) via the GitLab agent.
In the helmfile of my cluster management project I wrote:
```yaml
- path: applications/circular/helmfile.yaml
```
The applications/circular/helmfile.yaml references the chart in the project's private Helm registry, and the deployment fails with:
```
Error: looks like "https://gitlab.com/api/v4/projects/28875077/packages/helm/stable" is not a valid chart repository or cannot be reached: failed to fetch https://gitlab.com/api/v4/projects/28875077/packages/helm/stable/index.yaml : 401 Unauthorized
```
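For what it's worth, here is a sketch of how credentials might be wired into the helmfile repository definition (the variable names and the deploy-token approach are assumptions, not something I found in the docs):

```yaml
# Sketch: authenticate helmfile against the project's Helm package registry.
# HELM_REGISTRY_USER/HELM_REGISTRY_PASSWORD are placeholder env vars; a deploy
# token with read_package_registry scope is one option for cross-project access.
repositories:
  - name: circular
    url: https://gitlab.com/api/v4/projects/28875077/packages/helm/stable
    username: {{ requiredEnv "HELM_REGISTRY_USER" }}
    password: {{ requiredEnv "HELM_REGISTRY_PASSWORD" }}

releases:
  - name: circular            # placeholder release name
    namespace: circular
    chart: circular/circular  # <repo name>/<chart name>, both placeholders
```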
I would also say that I prefer this pull-based approach security-wise, but as some of the other posters have said, the documentation is still lacking. This is very much a bottleneck in our day-to-day operations.
E.g.
I get a warning that my k8s and agent versions are out of sync, and "click this link to learn how to upgrade"
From there, it says to just copy and paste the command I used during install (I installed a while back, so I don't have that command).
OK, so now how do I get that command?
Probably somewhere in the agent tab of the project?
Is it under "Actions"? No, that just forces me to add another agent.
Please simplify the documentation on how to use kubectl commands inside the GitLab CI file (after connecting the agent). It is taking way too long for such a simple thing.
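For reference, a minimal sketch of such a job (the agent config project path and agent name are placeholders; GitLab injects the KUBECONFIG automatically once the agent authorizes the project):

```yaml
deploy:
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  script:
    # The injected KUBECONFIG contains one context per authorized agent,
    # named "<agent-config-project-path>:<agent-name>".
    - kubectl config use-context path/to/agent-config-project:my-agent
    - kubectl get pods --namespace my-namespace
```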
Many thanks for the GitLab Kubernetes Agent direction! For us, it is currently one of two core components to apply a true multi-tenancy setup to different shared Kubernetes clusters. The main advantages compared to GitLab Runners, from our point of view, are:
Centralized configuration as code of which groups/projects are authorized
Impersonation of centrally defined users/service accounts without the risk of leaking the KUBECONFIG/credentials
Possibility to decouple an agent from a runner/environment
Unfortunately, we already ran into performance issues/rate limits with our very first deployment test as a GitLab Ultimate customer on gitlab.com.
```
Error from server (InternalError): an error on the server ("<html><head>\n<meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<title>502 Server Error</title>\n</head>\n<body text=#000000 bgcolor=#ffffff>\n<h1>Error: Server Error</h1>\n<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>\n<h2></h2>\n</body></html>") has prevented the request from succeeding (get clusterrolebindings.rbac.authorization.k8s.io)
```
So two questions regarding this:
What are the currently applied limits on kas.gitlab.com and is there any option to increase them (apart from deploying multiple agents to a cluster)? In our current setup we planned 1 agent per cluster/environment.
Is there any option to deploy a self-managed, standalone KAS and use this instance instead of kas.gitlab.com while staying on the GitLab SaaS offering?
I like the idea of having the agent provision the existing cluster, but sadly it does not work for me, and I have been searching since yesterday for a solution but could not find any. My agent simply won't start. Any hint would be welcome!
{"level":"info","time":"2022-03-06T20:42:05.241Z","msg":"Observability endpoint is up","mod_name":"observability","net_network":"tcp","net_address":"[::]:8080"}{"level":"info","time":"2022-03-06T20:42:05.242Z","msg":"Feature status change","feature_name":"tunnel","feature_status":true}{"level":"warn","time":"2022-03-06T20:42:25.228Z","msg":"GetConfiguration failed","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: failed to send handshake request: Get \\\"https://kas.gitlab.com\\\": context deadline exceeded\""}{"level":"error","time":"2022-03-06T20:42:25.228Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"Connect(): rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: failed to send handshake request: Get \\\"https://kas.gitlab.com\\\": context deadline exceeded\""}{"level":"error","time":"2022-03-06T20:42:25.229Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"Connect(): rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: failed to send handshake request: Get \\\"https://kas.gitlab.com\\\": context deadline exceeded\""}{"level":"error","time":"2022-03-06T20:42:25.229Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"Connect(): rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: failed to send handshake request: Get \\\"https://kas.gitlab.com\\\": context deadline exceeded\""}{"level":"error","time":"2022-03-06T20:42:25.229Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"Connect(): rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: failed to send handshake request: Get \\\"https://kas.gitlab.com\\\": context deadline exceeded\""}{"level":"error","time":"2022-03-06T20:42:25.229Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"Connect(): rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: failed to send handshake request: Get \\\"https://kas.gitlab.com\\\": context deadline exceeded\""}{"level":"error","time":"2022-03-06T20:42:25.229Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"Connect(): rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: failed to send handshake request: Get \\\"https://kas.gitlab.com\\\": context deadline exceeded\""}{"level":"error","time":"2022-03-06T20:42:25.229Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"Connect(): rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: failed to send handshake request: Get \\\"https://kas.gitlab.com\\\": context deadline exceeded\""}
The idea behind it and the configuration look promising to me. I like the idea of having everything re-deployable from my repository. I really wish I could get it running!
Thanks a lot, GitLab! The agent is much better than the previous connection via certificates.
The only issue I've seen so far is that group-level authorization for agents doesn't work for me. It only works for the project the config.yaml is placed in.
I configured the agent according to this part of the docs with a very simple config that looks like this:
```yaml
ci_access:
  groups:
    - id: my-group-name
```
(I only specified the group name (not a full path), since we don't use subgroups.)
As said, it works for the project containing the config.yaml, but I can't add this agent in any other project of the group.
I installed the agent using the docker command I copied from my Omnibus GitLab 14.8.2. My agent's status is "Never connected"; I don't know why. I set gitlab_kas['enable'] = true in my gitlab.rb and ran gitlab-ctl reconfigure. My agent's logs are filled with:
{"level":"warn","time":"2022-03-16T15:02:50.609Z","msg":"GetConfiguration failed","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: expected handshake response status code 101 but got 400\""}{"level":"error","time":"2022-03-16T15:04:27.449Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"Connect(): rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing failed to WebSocket dial: expected handshake response status code 101 but got 400\""}
I'm not sure what else I'm supposed to configure; obviously something's amiss. The agent can clearly communicate with GitLab. There are a bunch of settings in gitlab-kas-config.yml that seem important, but I don't know if I should be messing with them or if that file is generated. A ps xa on the GitLab container indicates that the kas-server is running.
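For reference, on Omnibus the agent is expected to reach KAS on the external URL under /-/kubernetes-agent/; a sketch of values for the gitlab-agent Helm chart (hostname and token are placeholders, worth double-checking against the docs):

```yaml
# values.yaml for the gitlab/gitlab-agent Helm chart;
# gitlab.example.com is a placeholder for the Omnibus external URL.
config:
  token: <agent-registration-token>
  # Omnibus serves KAS on the external URL under /-/kubernetes-agent/
  kasAddress: wss://gitlab.example.com/-/kubernetes-agent/
```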
I'm wondering how I can use the agent, which I set up successfully with one project, in other projects. I can't find anything in the docs.
I'm looking for a similar functionality as I had before with a cert-based group cluster.
I've found the transition to the agent from the certificate-based Kubernetes integration to be difficult. I'm still having a hard time redeveloping deployment pipelines that use kubectl calls from different groups. Similar to #230 and #346566 (closed), and because of #205 (I'm already using kpt v1.0), I seem to be blocked.
I'm happy to help update the kpt package if that is of any value.
Two pitfalls I would have liked to see in the docs before finding them myself:
After connecting GitLab project Foo to a cluster and installing the agent there, then granting ci_access to GitLab project Bar, it can take a few minutes until the CI pipelines in Bar are able to connect to the cluster.
The paths of projects and groups which are used in the agent's config.yaml and as a kube-context in the CI pipelines are case-sensitive.
Found my way here through a link requesting feedback on the Agent and, I have to say, I agree with @edel-tkilian that the documentation is pretty terrible when it comes to a new user experience. I'll describe my use case, environment, and pain points as best I can.
First, a little background on my environment. I'm a developer. I also administer an enterprise VMware cluster, along with a small-footprint AWS environment. I have a strong background in bare-metal admin and VMware, but AWS is a new animal for me. I've used it for a few EC2 instances and Route53 DNS, but nothing else. We have a self-hosted GitLab 14.8 instance (this will be updated this week) that is currently configured to do Auto DevOps in our self-hosted K8s cluster, which is itself managed by Rancher 2.x.
This was set up some time ago and works well. We have recently decided to evaluate using EKS for some of our workloads.
After several false starts manually configuring networks, IAM roles, etc., I came across the "eksctl" tool which allowed me to quickly and easily create a new EKS cluster and the worker nodes to go along with it.
With the cluster created, I encountered the warning that the shared certificate method of adding a cluster to gitlab has been deprecated. I don't agree with this decision, but as I have no real choice in the matter, I've attempted to use the new (gitlab agent) method for K8S CI/CD integration.
This third page of instructions says to enable the gitlab_kas option and reconfigure gitlab, which was done.
Returning to the previous (agent install) instructions, the next step is to "Define a configuration repository" which states "To create an Agent, you need a GitLab repository to hold its configuration file. If you already have a repository holding your cluster's manifest files, you can use it to store your Agent's configuration file and sync them with no further steps."
This is immediately confusing as it's not clear what is meant by repository here. My assumption is that, in gitlab parlance, we're talking about a new "project". Further confusing the issue is the reference to my "cluster's manifest files" -- I have no such files for either the existing Rancher managed cluster, nor the newly created K8S cluster. I have no idea what these files are or where they would normally be. So, in my ignorance, I decided to simply create an entirely new project, along with the empty YAML file, and hope this is the correct thing to do.
At this extremely early stage of setting up the agent, I'm already on shaky ground. I don't know why I need an agent configuration file in the first place if a completely empty file is sufficient, nor do I know if putting it in a new project as I did is the correct thing to do, or if it's supposed to be in each of the projects I want to use with Auto DevOps. The name I give to the cluster at this point, by naming the directory under .gitlab/agents/, is also of unknown use, so I'm just guessing at the purpose of a name that I may want to change later.
None of this is explained in the documentation -- at least, not on the above referenced pages that are telling you what you need to do to simply add an existing k8s cluster to gitlab. In short, the instructions tell you without much detail the "what" you need to do, but none of the "why" and little of the "how".
Moving on, I create the project and empty config.yaml and continue.
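For reference, this is the layout the docs seem to expect (my-agent is a placeholder; the directory name becomes the agent's name at registration):

```yaml
# Configuration project layout: .gitlab/agents/my-agent/config.yaml
# The file may start out empty; a minimal ci_access entry authorizing another
# project ("group/project" is a placeholder) would look like:
ci_access:
  projects:
    - id: group/project
```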
The next step is to "register the agent with gitlab" at which point I'm given the choice of either using a provided docker command, or an even more inscrutably complicated "advanced installation method." Clearly, given my complete ignorance about just what configuration options are available, let alone which options I may want (or indeed, need) to change, I would prefer to use the one-liner.
However, my local docker CLI is (obviously?) not configured to talk to the remote k8s cluster. The cluster was created entirely through AWS -- either through their UI, or via the AWS command line tools. I have never used the plain docker command to connect to the cluster, nor do I ever intend to. All of this management is done through the orchestrator, be that Rancher, the AWS UI, or the kubectl/helm command line utilities.
This is where I've arrived. At this point I think I need to open a shell on one of the EC2 instances that was created to run the EKS managed workload and run the docker command there, but I'm not certain, and given the lack of documentation I have no real choice but to just try it and see what happens.
---- Expectations
My expectations were admittedly high, considering how much easier the other tools we use are to configure in similar circumstances. With a functioning connection to the k8s cluster, which I do have via kubectl, I would expect to simply perform some sort of 'helm install' to install the agent.
Ideally, I would expect the gitlab UI to be able to do this for me. After all, making complicated CLI tasks easy to perform from a GUI is the entire reason we use gitlab in the first place.
In any case, I hope that the stewards of the project can take a step back now and again and approach a task as though they are completely new users/admins who are given a task like this one, to see just how lackluster the documentation really is. I have a K8S cluster. I have a self hosted gitlab instance. I want to "configure gitlab to use the k8s cluster for autodevops". In my mind this should be a one or two step process in the GUI.
I'm hopeful that this is indeed the case in an updated version of gitlab, which I will be installing as soon as this post is done.
ETA: Well, this went differently than I had expected. After updating from 14.8 to 14.9, all mention of adding clusters via the agent is completely missing from the UI.
The installation of the agent worked well. The one-liner command could use an explanation that the Docker container runs locally. For users of GitLab runners, the Agent install docs could link to installing the runner with the Agent.
More in-depth detail on how the Agent manages K8s commands would help. The current documentation has references like "allow access to where Kubernetes manifests are stored" and "the Agent will monitor K8s manifests for changes" and redeploy. This paints a picture of the agent monitoring deployment.yaml etc. directly in a directory. Once you move to Helm, this seems like a sub-optimal design. Alternatives that include Helm support or traditional GitLab runner support should be linked.
This doc (https://docs.gitlab.com/runner/install/kubernetes-agent.html) is misleading in suggesting it would work for a CI/CD workflow. The linked GitOps config section does not tell you where to put the file (more incomplete docs). I likely will not use this GitOps flow now though; it seems not suited for Helm-based deploys? Leaning towards deploying the runner 'manually' via Helm.
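For anyone else looking: the GitOps configuration goes into the agent's own file at .gitlab/agents/<agent-name>/config.yaml in the configuration project; a sketch (the project path and glob are placeholders):

```yaml
# .gitlab/agents/<agent-name>/config.yaml
gitops:
  manifest_projects:
    - id: group/manifest-project      # placeholder: project holding the manifests
      paths:
        - glob: 'manifests/**/*.yaml' # which files the agent watches and applies
```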
Runner deploy:
The change to runner deployment with runner-manifest.yaml feels much cleaner. Progress on the Helm-based runner and values.yaml is very nice to see, as previous implementations felt very custom. This is the best deployment route, though. The example could get more complete docs (RBAC, service accounts, deploy manifest, etc.).
Thanks!