"Agent not found" in the Kubernetes logs and "Never connected" in the GitLab UI
The Problem
Hi everyone,
I have a problem with my GitLab Kubernetes agent (agentk): its logs show an "agent not found" error like this:
{"level":"warn","time":"2022-04-22T09:26:11.868Z","msg":"GetConfiguration.Recv failed","error":"rpc error: code = NotFound desc = agent not found","correlation_id":"01G189FR3T2E7Y3NC80W7AJEDS"}
{"level":"error","time":"2022-04-22T09:26:53.274Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"rpc error: code = NotFound desc = agent not found","correlation_id":"01G189H0D0D1GJ9SF1ACN3429T"}
{"level":"warn","time":"2022-04-22T09:27:22.977Z","msg":"GetConfiguration.Recv failed","error":"rpc error: code = NotFound desc = agent not found","correlation_id":"01G189HY4PBY615BG62SKXJ6DJ"}
{"level":"error","time":"2022-04-22T09:27:29.084Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"rpc error: code = NotFound desc = agent not found","correlation_id":"01G189J4RTABX2853A0XMTG2K1"}
{"level":"error","time":"2022-04-22T09:29:00.093Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"rpc error: code = NotFound desc = agent not found","correlation_id":"01G189MZGD5ARZF4CCADAH4WHJ"}
{"level":"warn","time":"2022-04-22T09:29:07.505Z","msg":"GetConfiguration.Recv failed","error":"rpc error: code = NotFound desc = agent not found","correlation_id":"01G189N7HVEYX55TPPQN84B88R"}
{"level":"error","time":"2022-04-22T09:29:37.049Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"rpc error: code = NotFound desc = agent not found","correlation_id":"01G189P017741ZTAXSZH08KF3G"}
{"level":"error","time":"2022-04-22T09:32:58.109Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"rpc error: code = NotFound desc = agent not found","correlation_id":"01G189W4CEPTN3NESY1QE1R7VD"}
{"level":"error","time":"2022-04-22T09:33:55.397Z","msg":"Error handling a connection","mod_name":"reverse_tunnel","error":"rpc error: code = NotFound desc = agent not found","correlation_id":"01G189XWAS2YXASA91E6G8T1NF"}
{"level":"warn","time":"2022-04-22T09:34:45.060Z","msg":"GetConfiguration.Recv failed","error":"rpc error: code = NotFound desc = agent not found","correlation_id":"01G189ZCTNXYJRBB4667D2M3YV"}
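To see at a glance whether every failure is the same NotFound error, the JSON log lines above can be grouped by level, message, and error. This is only a minimal sketch: the function name `summarize_agent_logs` is made up for illustration, and the two sample lines are shortened copies of the entries above.

```python
import json
from collections import Counter

def summarize_agent_logs(lines):
    """Count (level, msg, error) combinations in JSON-formatted log lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if not isinstance(entry, dict):
            continue
        counts[(entry.get("level"), entry.get("msg"), entry.get("error"))] += 1
    return counts

# Shortened copies of two log entries from this post (timestamps and IDs omitted).
sample = [
    '{"level":"warn","msg":"GetConfiguration.Recv failed",'
    '"error":"rpc error: code = NotFound desc = agent not found"}',
    '{"level":"error","msg":"Error handling a connection","mod_name":"reverse_tunnel",'
    '"error":"rpc error: code = NotFound desc = agent not found"}',
]

for (level, msg, error), n in summarize_agent_logs(sample).most_common():
    print(n, level, msg, error, sep="  ")
```

The same function accepts any iterable of lines, so the saved output of the agent pod's logs can be passed in as an open file object.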
I followed the instructions at https://about.gitlab.com/blog/2021/09/10/setting-up-the-k-agent/ to set up the agent.
Below is my YAML file for the agent:
apiVersion: v1
kind: Namespace
metadata:
  name: dadangnh-1
---
apiVersion: v1
kind: Namespace
metadata:
  name: dadangnh-2
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gka
  namespace: gka
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gka
  namespace: gka
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gka
  template:
    metadata:
      labels:
        app: gka
      namespace: gka
    spec:
      serviceAccountName: gka
      containers:
      - name: ocp-djp
        image: "registry.gitlab.com/gitlab-org/cluster-integration/gitlab-agent/agentk:stable"
        args:
        - --token-file=/config/token
        - --kas-address
        - wss://git.intranet.domainname/-/kubernetes-agent/
        volumeMounts:
        - name: token-volume
          mountPath: /config
        - name: ca-pemstore-volume
          mountPath: /etc/ssl/certs/proxy_cert.crt
          subPath: proxy_cert.crt
      volumes:
      - name: token-volume
        secret:
          secretName: gka-token
      - name: ca-pemstore-volume
        configMap:
          name: ca-pemstore
          items:
          - key: proxy_cert.crt
            path: proxy_cert.crt
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gka-write-cm
rules:
- resources:
  - 'configmaps'
  apiGroups:
  - ''
  verbs:
  - create
  - update
  - delete
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gka-write-binding-cm
roleRef:
  name: gka-write-cm
  kind: ClusterRole
  apiGroup: rbac.authorization.k8s.io
subjects:
- name: gka
  kind: ServiceAccount
  namespace: gka
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gka-read-cm
rules:
- resources:
  - 'configmaps'
  apiGroups:
  - ''
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gka-read-binding-cm
roleRef:
  name: gka-read-cm
  kind: ClusterRole
  apiGroup: rbac.authorization.k8s.io
subjects:
- name: gka
  kind: ServiceAccount
  namespace: gka
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: gka
  name: gka-write
rules:
- resources:
  - '*'
  apiGroups:
  - '*'
  verbs:
  - create
  - update
  - delete
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: gka
  name: gka-write-binding
roleRef:
  name: gka-write
  kind: Role
  apiGroup: rbac.authorization.k8s.io
subjects:
- name: gka
  kind: ServiceAccount
  namespace: gka
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: gka
  name: gka-read
rules:
- resources:
  - '*'
  apiGroups:
  - '*'
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: gka
  name: gka-read-binding
roleRef:
  name: gka-read
  kind: Role
  apiGroup: rbac.authorization.k8s.io
subjects:
- name: gka
  kind: ServiceAccount
  namespace: gka
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dadangnh-1
  name: gka-write-dadangnh-1
rules:
- resources:
  - '*'
  apiGroups:
  - '*'
  verbs:
  - create
  - update
  - delete
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: dadangnh-1
  name: gka-write-binding-dadangnh-1
roleRef:
  name: gka-write-dadangnh-1
  kind: Role
  apiGroup: rbac.authorization.k8s.io
subjects:
- name: gka
  kind: ServiceAccount
  namespace: gka
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dadangnh-1
  name: gka-read-dadangnh-1
rules:
- resources:
  - '*'
  apiGroups:
  - '*'
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: dadangnh-1
  name: gka-read-binding-dadangnh-1
roleRef:
  name: gka-read-dadangnh-1
  kind: Role
  apiGroup: rbac.authorization.k8s.io
subjects:
- name: gka
  kind: ServiceAccount
  namespace: gka
On the GitLab side, I have already created the agent, but its status shows "Never connected".
I also reported this problem in gitlab-org/gitlab#342696 (comment 920763146).
Is there something I missed?
Context
Below is the detailed installation information:
- I use the latest GitLab (v14.10.0), installed with Omnibus.
- GitLab and GitLab Runner are installed on separate VMs within the intranet, behind an HTTP proxy (for internet access), and I use my own SSL CA (so a custom CA is injected into the Runner and Agent configuration).
- The Kubernetes cluster is actually an OpenShift cluster (v4.8.35) that is also on the intranet and uses an HTTP proxy for internet access.
- Both GitLab and GitLab Runner work normally behind the proxy, for example checking for GitLab version updates, using the GitLab container registry, and running CI pipelines on GitLab Runner.
- Deployments on OpenShift also work normally behind the proxy, for example pulling container images from hub.docker.com, quay.io, etc.
- As for the token, I have already recreated the agent several times and copied and pasted the token each time, so I don't think the token is wrong.
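Stray whitespace is one thing a copy and paste can introduce, and it is cheap to rule out. The sketch below checks a decoded token value for it. It is an illustration only: the helper name `token_problems` and the sample token are made up, and it assumes the value comes from the `token` key of the `gka-token` Secret (the key name is implied by `--token-file=/config/token`). Whether this is related to the "agent not found" error here is not known.

```python
import base64
from typing import List

def token_problems(raw: bytes) -> List[str]:
    """Return a list of whitespace-related problems found in a token value."""
    problems = []
    if not raw.strip():
        problems.append("token is empty")
        return problems
    if raw != raw.strip():
        problems.append("leading or trailing whitespace (often a newline)")
    # Iterating over bytes yields integers; `int in bytes` tests for that byte.
    if any(b in raw.strip() for b in b" \t\r\n"):
        problems.append("whitespace inside the token")
    return problems

# Hypothetical example: a token that was base64-encoded together with a newline.
encoded = base64.b64encode(b"example-token\n")
print(token_problems(base64.b64decode(encoded)))
```

Secret values under `data` are base64-encoded, so the value has to be decoded first, as the example does, before it is checked.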
Is there any way I can help debug this?
I also have a question: does the GitLab Agent need any ports opened to the GitLab VM and/or the GitLab Runner VM besides the ones listed below?
I have already opened ports 22, 80, 443, 5000, 5050, and 8150
from the OpenShift cluster to the GitLab VM, as described here. But I am afraid there is a network issue (a port not opened) between the GitLab VM and the OpenShift cluster that keeps the GitLab Agent from working.
And are there any requirements in the other direction, from the GitLab VM to the OpenShift side?
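One way to confirm that the listed ports are reachable is a plain TCP connect test run from inside the cluster. The sketch below is a minimal version of that idea; the function names are made up for illustration. It only shows that a direct TCP connection can be opened. It does not go through the HTTP proxy, and it does not verify the TLS certificate or the WebSocket upgrade that the `wss://` address relies on.

```python
import socket
from typing import Dict, Iterable

def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False

def check_ports(host: str, ports: Iterable[int], timeout: float = 3.0) -> Dict[int, bool]:
    """Check several ports on one host and map each port to its result."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For example, calling `check_ports("git.intranet.domainname", [22, 80, 443, 5000, 5050, 8150])` from a pod in the OpenShift cluster returns a dictionary showing which of those ports accepted a connection.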
Thank you