Remote Development Setup: Workspaces not creating until agent reinstalled

Summary

I followed the workspace setup docs, from creating a cluster through to installing the agent.

In my group, I used a fork of a 'working' remote dev example project to test things and get them going.

After following the guide, I was unable to create a workspace - the workspace page would simply 'spin' after attempting to create the workspace. Deletion also left the workspace with a 'spinning' activity icon.

After I removed and recreated the GitLab agent in the cluster, workspace creation was successful.

Steps to reproduce

  1. Following the prerequisites in the workspace setup guide, I created a new GKE cluster.
  2. Followed the gitlab-workspaces-proxy installation instructions.
  3. Followed the instructions to install the agent for Kubernetes, leaving config.yaml empty for the moment:
helm repo add gitlab https://charts.gitlab.io
helm repo update
helm upgrade --install rdev gitlab/gitlab-agent \
    --namespace gitlab-agent-rdev \
    --create-namespace \
    --set image.tag=v16.1.1 \
    --set config.token=xxx \
    --set config.kasAddress=wss://kas.gitlab.com 
  4. Added the remote_development section to the agent's config.yaml file.
  5. Attempted to create a workspace, which resulted in a 'spinning' progress indicator but nothing else visibly happening.
  6. Several minutes later, attempted to stop that workspace. Termination also resulted in a 'spinning' progress indicator.
  7. Left the system in that state overnight.
  8. The 'spinning' progress indicator was still showing afterwards.
  9. Tested the agent via a CI pipeline: the agent worked successfully in the CI context.
  10. Recreated the agent: unregistered it via the UI, removed the Helm installation, then reinstalled and re-registered it. config.yaml was left untouched (still containing remote_development).
  11. Workspace creation was successful.
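
The agent recreation described in the steps above could look roughly like the following. This is a sketch, not the exact commands that were run: the release name, namespace, chart values and image tag are copied from the install command above, while `NEW_TOKEN`, the `run` helper and the `DRY_RUN` switch are hypothetical additions for illustration. Unregistering and re-registering the agent happens in the GitLab UI and is not covered here.

```shell
#!/bin/sh
# Sketch of the agent recreation. Release name and namespace are taken from
# the install command above. NEW_TOKEN is a hypothetical placeholder for the
# token issued when the agent is registered again in the UI.
RELEASE=rdev
NAMESPACE=gitlab-agent-rdev

# Illustrative helper: with DRY_RUN=1 the command is printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recreate_agent() {
    # Remove the existing Helm release of the agent.
    run helm uninstall "$RELEASE" --namespace "$NAMESPACE"
    # Reinstall with the same chart values and the newly issued token.
    run helm upgrade --install "$RELEASE" gitlab/gitlab-agent \
        --namespace "$NAMESPACE" \
        --create-namespace \
        --set image.tag=v16.1.1 \
        --set "config.token=${NEW_TOKEN:-xxx}" \
        --set config.kasAddress=wss://kas.gitlab.com
}
```

Running `DRY_RUN=1 recreate_agent` prints the two Helm commands without touching the cluster, which is a cheap way to check them before running them for real.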

What is the current bug behavior?

Workspace creation (and subsequent termination) never completed; the UI showed a 'spinning' progress indicator indefinitely.

What is the expected correct behavior?

Workspace creation (and termination) completes successfully.

Relevant logs and/or screenshots

Logs from agent
❯ kubectl logs pod/rdev-gitlab-agent-6ff8749b5f-75l24                                    
{"level":"info","time":"2023-06-01T16:16:53.382Z","msg":"Observability endpoint is up","mod_name":"observability","net_network":"tcp","net_address":"[::]:8080"}
{"level":"info","time":"2023-06-01T16:16:55.858Z","msg":"attempting to acquire leader lease gitlab-agent-rdev/agent-61679-lock...\n","agent_id":61679}
{"level":"info","time":"2023-06-01T16:16:55.877Z","msg":"successfully acquired lease gitlab-agent-rdev/agent-61679-lock\n","agent_id":61679}
{"level":"info","time":"2023-06-01T16:16:55.877Z","msg":"Event occurred","object":{"name":"agent-61679-lock","namespace":"gitlab-agent-rdev"},"fieldPath":"","kind":"Lease","apiVersion":"coordination.k8s.io/v1","type":"Normal","reason":"LeaderElection","message":"rdev-gitlab-agent-6ff8749b5f-75l24 became leader","agent_id":61679}
{"level":"info","time":"2023-06-01T16:21:28.185Z","msg":"starting full sync","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:21:38.186Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:21:52.898Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:22:05.715Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:22:17.302Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:22:29.702Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:22:42.963Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:22:54.500Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:23:07.914Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:23:19.790Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:23:32.304Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:23:44.708Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:23:58.397Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:24:14.911Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:24:35.058Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:24:55.523Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:25:19.815Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:25:32.415Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:25:44.593Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:25:58.290Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:26:10.110Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:26:21.104Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:26:33.685Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:26:45.601Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:26:58.207Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:27:10.522Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:27:22.007Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"error","time":"2023-06-01T16:27:25.530Z","msg":"error retrieving resource lock gitlab-agent-rdev/agent-61679-lock: Get \"https://10.167.240.1:443/apis/coordination.k8s.io/v1/namespaces/gitlab-agent-rdev/leases/agent-61679-lock\": context deadline exceeded\n","agent_id":61679}
{"level":"info","time":"2023-06-01T16:27:25.530Z","msg":"failed to renew lease gitlab-agent-rdev/agent-61679-lock: timed out waiting for the condition\n","agent_id":61679}
{"level":"info","time":"2023-06-01T16:27:25.530Z","msg":"Event occurred","object":{"name":"agent-61679-lock","namespace":"gitlab-agent-rdev"},"fieldPath":"","kind":"Lease","apiVersion":"coordination.k8s.io/v1","type":"Normal","reason":"LeaderElection","message":"rdev-gitlab-agent-6ff8749b5f-75l24 stopped leading","agent_id":61679}
{"level":"info","time":"2023-06-01T16:27:34.406Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"info","time":"2023-06-01T16:27:46.305Z","msg":"starting partial update","mod_name":"remote_development","agent_id":61679}
{"level":"error","time":"2023-06-01T16:27:55.530Z","msg":"Failed to release lock: Put \"https://10.167.240.1:443/apis/coordination.k8s.io/v1/namespaces/gitlab-agent-rdev/leases/agent-61679-lock\": dial tcp 10.167.240.1:443: i/o timeout\n","agent_id":61679}
{"level":"error","time":"2023-06-01T16:27:55.530Z","msg":"Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"agent-61679-lock.176495c7b623a77d\", GenerateName:\"\", Namespace:\"gitlab-agent-rdev\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Lease\", Namespace:\"gitlab-agent-rdev\", Name:\"agent-61679-lock\", UID:\"3435c2fd-07b5-4f10-9cab-db976dbf882c\", APIVersion:\"coordination.k8s.io/v1\", ResourceVersion:\"175901\", FieldPath:\"\"}, Reason:\"LeaderElection\", Message:\"rdev-gitlab-agent-6ff8749b5f-75l24 stopped leading\", Source:v1.EventSource{Component:\"gitlab-agent\", Host:\"\"}, FirstTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), LastTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'Post \"https://10.167.240.1:443/api/v1/namespaces/gitlab-agent-rdev/events\": dial tcp 10.167.240.1:443: i/o timeout'(may retry after sleeping)\n","agent_id":61679}
{"level":"error","time":"2023-06-01T16:27:55.531Z","msg":"Error checking security policies","mod_name":"starboard_vulnerability","error":"could not retrieve security policies: rpc error: code = Canceled desc = context canceled","agent_id":61679}
{"level":"info","time":"2023-06-01T16:27:55.531Z","msg":"informer stopped","mod_name":"remote_development","agent_id":61679}
{"level":"error","time":"2023-06-01T16:28:27.328Z","msg":"Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"agent-61679-lock.176495c7b623a77d\", GenerateName:\"\", Namespace:\"gitlab-agent-rdev\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Lease\", Namespace:\"gitlab-agent-rdev\", Name:\"agent-61679-lock\", UID:\"3435c2fd-07b5-4f10-9cab-db976dbf882c\", APIVersion:\"coordination.k8s.io/v1\", ResourceVersion:\"175901\", FieldPath:\"\"}, Reason:\"LeaderElection\", Message:\"rdev-gitlab-agent-6ff8749b5f-75l24 stopped leading\", Source:v1.EventSource{Component:\"gitlab-agent\", Host:\"\"}, FirstTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), LastTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'Post \"https://10.167.240.1:443/api/v1/namespaces/gitlab-agent-rdev/events\": dial tcp 10.167.240.1:443: i/o timeout'(may retry after sleeping)\n","agent_id":61679}
{"level":"error","time":"2023-06-01T16:29:07.345Z","msg":"Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"agent-61679-lock.176495c7b623a77d\", GenerateName:\"\", Namespace:\"gitlab-agent-rdev\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Lease\", Namespace:\"gitlab-agent-rdev\", Name:\"agent-61679-lock\", UID:\"3435c2fd-07b5-4f10-9cab-db976dbf882c\", APIVersion:\"coordination.k8s.io/v1\", ResourceVersion:\"175901\", FieldPath:\"\"}, Reason:\"LeaderElection\", Message:\"rdev-gitlab-agent-6ff8749b5f-75l24 stopped leading\", Source:v1.EventSource{Component:\"gitlab-agent\", Host:\"\"}, FirstTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), LastTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'Post \"https://10.167.240.1:443/api/v1/namespaces/gitlab-agent-rdev/events\": dial tcp 10.167.240.1:443: i/o timeout'(may retry after sleeping)\n","agent_id":61679}
{"level":"error","time":"2023-06-01T16:29:47.354Z","msg":"Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"agent-61679-lock.176495c7b623a77d\", GenerateName:\"\", Namespace:\"gitlab-agent-rdev\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Lease\", Namespace:\"gitlab-agent-rdev\", Name:\"agent-61679-lock\", UID:\"3435c2fd-07b5-4f10-9cab-db976dbf882c\", APIVersion:\"coordination.k8s.io/v1\", ResourceVersion:\"175901\", FieldPath:\"\"}, Reason:\"LeaderElection\", Message:\"rdev-gitlab-agent-6ff8749b5f-75l24 stopped leading\", Source:v1.EventSource{Component:\"gitlab-agent\", Host:\"\"}, FirstTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), LastTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'Post \"https://10.167.240.1:443/api/v1/namespaces/gitlab-agent-rdev/events\": dial tcp 10.167.240.1:443: i/o timeout'(may retry after sleeping)\n","agent_id":61679}
{"level":"error","time":"2023-06-01T16:30:27.360Z","msg":"Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"agent-61679-lock.176495c7b623a77d\", GenerateName:\"\", Namespace:\"gitlab-agent-rdev\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Lease\", Namespace:\"gitlab-agent-rdev\", Name:\"agent-61679-lock\", UID:\"3435c2fd-07b5-4f10-9cab-db976dbf882c\", APIVersion:\"coordination.k8s.io/v1\", ResourceVersion:\"175901\", FieldPath:\"\"}, Reason:\"LeaderElection\", Message:\"rdev-gitlab-agent-6ff8749b5f-75l24 stopped leading\", Source:v1.EventSource{Component:\"gitlab-agent\", Host:\"\"}, FirstTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), LastTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'Post \"https://10.167.240.1:443/api/v1/namespaces/gitlab-agent-rdev/events\": dial tcp 10.167.240.1:443: i/o timeout'(may retry after sleeping)\n","agent_id":61679}
{"level":"error","time":"2023-06-01T16:31:07.367Z","msg":"Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"agent-61679-lock.176495c7b623a77d\", GenerateName:\"\", Namespace:\"gitlab-agent-rdev\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Lease\", Namespace:\"gitlab-agent-rdev\", Name:\"agent-61679-lock\", UID:\"3435c2fd-07b5-4f10-9cab-db976dbf882c\", APIVersion:\"coordination.k8s.io/v1\", ResourceVersion:\"175901\", FieldPath:\"\"}, Reason:\"LeaderElection\", Message:\"rdev-gitlab-agent-6ff8749b5f-75l24 stopped leading\", Source:v1.EventSource{Component:\"gitlab-agent\", Host:\"\"}, FirstTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), LastTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'Post \"https://10.167.240.1:443/api/v1/namespaces/gitlab-agent-rdev/events\": dial tcp 10.167.240.1:443: i/o timeout'(may retry after sleeping)\n","agent_id":61679}
{"level":"error","time":"2023-06-01T16:31:47.369Z","msg":"Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"agent-61679-lock.176495c7b623a77d\", GenerateName:\"\", Namespace:\"gitlab-agent-rdev\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Lease\", Namespace:\"gitlab-agent-rdev\", Name:\"agent-61679-lock\", UID:\"3435c2fd-07b5-4f10-9cab-db976dbf882c\", APIVersion:\"coordination.k8s.io/v1\", ResourceVersion:\"175901\", FieldPath:\"\"}, Reason:\"LeaderElection\", Message:\"rdev-gitlab-agent-6ff8749b5f-75l24 stopped leading\", Source:v1.EventSource{Component:\"gitlab-agent\", Host:\"\"}, FirstTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), LastTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'Post \"https://10.167.240.1:443/api/v1/namespaces/gitlab-agent-rdev/events\": dial tcp 10.167.240.1:443: i/o timeout'(may retry after sleeping)\n","agent_id":61679}
{"level":"error","time":"2023-06-01T16:32:27.374Z","msg":"Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:\"\", APIVersion:\"\"}, ObjectMeta:v1.ObjectMeta{Name:\"agent-61679-lock.176495c7b623a77d\", GenerateName:\"\", Namespace:\"gitlab-agent-rdev\", SelfLink:\"\", UID:\"\", ResourceVersion:\"\", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:\"Lease\", Namespace:\"gitlab-agent-rdev\", Name:\"agent-61679-lock\", UID:\"3435c2fd-07b5-4f10-9cab-db976dbf882c\", APIVersion:\"coordination.k8s.io/v1\", ResourceVersion:\"175901\", FieldPath:\"\"}, Reason:\"LeaderElection\", Message:\"rdev-gitlab-agent-6ff8749b5f-75l24 stopped leading\", Source:v1.EventSource{Component:\"gitlab-agent\", Host:\"\"}, FirstTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), LastTimestamp:time.Date(2023, time.June, 1, 16, 27, 25, 530228605, time.Local), Count:1, Type:\"Normal\", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:\"\", Related:(*v1.ObjectReference)(nil), ReportingController:\"\", ReportingInstance:\"\"}': 'Post \"https://10.167.240.1:443/api/v1/namespaces/gitlab-agent-rdev/events\": dial tcp 10.167.240.1:443: i/o timeout'(may retry after sleeping)\n","agent_id":61679}
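
To make logs like the one above easier to triage, a small filter along these lines may help. It is a sketch that assumes the agent emits one JSON object per line with `level` and `time` fields, as in the output above, and that GNU `date` is available; the function names are made up for illustration. `epoch_to_utc` is included because the gitlab-workspaces-proxy log reports `ts` as Unix epoch seconds, which makes it awkward to line up with the agent's ISO 8601 timestamps.

```shell
#!/bin/sh
# Tally agent log lines per level. Reads JSON lines on stdin and prints
# one "level=count" pair per line.
count_levels() {
    # Extract the value of the leading "level" field, then count occurrences.
    sed -n 's/^{"level":"\([a-z]*\)".*/\1/p' | sort | uniq -c | awk '{print $2 "=" $1}'
}

# Print the timestamp of the first line reporting a failed lease renewal.
first_lease_failure() {
    grep -m 1 'failed to renew lease' | sed -n 's/.*"time":"\([^"]*\)".*/\1/p'
}

# Convert a proxy-log "ts" value (Unix epoch seconds, possibly fractional)
# to an ISO 8601 UTC timestamp. Assumes GNU date (the -d @EPOCH form).
epoch_to_utc() {
    # Strip the fractional part so only whole seconds are passed to date.
    date -u -d "@${1%%.*}" +%Y-%m-%dT%H:%M:%SZ
}
```

For example, `kubectl logs pod/rdev-gitlab-agent-6ff8749b5f-75l24 | count_levels` summarizes the agent log by level, and `epoch_to_utc 1685631310.7642138` prints `2023-06-01T14:55:10Z`, which agrees with the plain-text `2023/06/01 14:55:10` lines next to that entry in the proxy log.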
gitlab-workspaces-proxy log
```
❯ kubectl logs pod/gitlab-workspaces-proxy-94f8dc8fd-vzrsn -n gitlab-workspaces
W0601 14:54:30.891550 1 client_config.go:618] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
{"level":"info","ts":1685631270.994384,"caller":"server/server.go:74","msg":"Starting proxy server..."}
2023/06/01 14:55:10 getHostnameFromState state=https://workspace.sting-ray.za.net/
2023/06/01 14:55:10 getHostnameFromState u.Hostname()=workspace.sting-ray.za.net
{"level":"error","ts":1685631310.7642138,"caller":"auth/middleware.go:140","msg":"error processing request","error":"could not find upstream workspace upstream not found","stacktrace":"gitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.errorResponse\n\t/app/pkg/auth/middleware.go:140\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.NewMiddleware.func1.1\n\t/app/pkg/auth/middleware.go:49\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/logging.NewMiddleware.func1.1\n\t/app/pkg/logging/middleware.go:13\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2487\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2947\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1991"}
{"level":"info","ts":1685631310.7642946,"caller":"logging/middleware.go:15","msg":"HTTP request processed","path":"/","ip":"10.164.1.15:39188","status":400,"host":"workspace.sting-ray.za.net","method":"GET"}
2023/06/01 15:03:04 getHostnameFromState state=https://workspace.sting-ray.za.net/
2023/06/01 15:03:04 getHostnameFromState u.Hostname()=workspace.sting-ray.za.net
{"level":"error","ts":1685631784.2443018,"caller":"auth/middleware.go:140","msg":"error processing request","error":"could not find upstream workspace upstream not found","stacktrace":"gitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.errorResponse\n\t/app/pkg/auth/middleware.go:140\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.NewMiddleware.func1.1\n\t/app/pkg/auth/middleware.go:49\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/logging.NewMiddleware.func1.1\n\t/app/pkg/logging/middleware.go:13\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2487\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2947\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1991"}
{"level":"info","ts":1685631784.2443583,"caller":"logging/middleware.go:15","msg":"HTTP request processed","path":"/","ip":"10.164.1.15:48394","status":400,"host":"workspace.sting-ray.za.net","method":"GET"}
{"level":"error","ts":1685631801.3317707,"caller":"auth/middleware.go:140","msg":"error processing request","error":"could not find auth code","stacktrace":"gitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.errorResponse\n\t/app/pkg/auth/middleware.go:140\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.handleRedirect\n\t/app/pkg/auth/middleware.go:114\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.NewMiddleware.func1.1\n\t/app/pkg/auth/middleware.go:37\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/logging.NewMiddleware.func1.1\n\t/app/pkg/logging/middleware.go:13\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2487\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2947\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1991"}
{"level":"info","ts":1685631801.3318355,"caller":"logging/middleware.go:15","msg":"HTTP request processed","path":"/auth/callback","ip":"10.164.1.15:47814","status":400,"host":"workspace.sting-ray.za.net","method":"GET"}
2023/06/01 15:03:50 getHostnameFromState state=https://workspace.sting-ray.za.net/
2023/06/01 15:03:50 getHostnameFromState u.Hostname()=workspace.sting-ray.za.net
{"level":"error","ts":1685631830.821113,"caller":"auth/middleware.go:140","msg":"error processing request","error":"could not find upstream workspace upstream not found","stacktrace":"gitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.errorResponse\n\t/app/pkg/auth/middleware.go:140\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.NewMiddleware.func1.1\n\t/app/pkg/auth/middleware.go:49\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/logging.NewMiddleware.func1.1\n\t/app/pkg/logging/middleware.go:13\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2487\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2947\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1991"}
{"level":"info","ts":1685631830.8211925,"caller":"logging/middleware.go:15","msg":"HTTP request processed","path":"/","ip":"10.164.1.15:48394","status":400,"host":"workspace.sting-ray.za.net","method":"GET"}
2023/06/01 15:08:36 getHostnameFromState state=https://workspace.sting-ray.za.net/
2023/06/01 15:08:36 getHostnameFromState u.Hostname()=workspace.sting-ray.za.net
{"level":"error","ts":1685632116.1037667,"caller":"auth/middleware.go:140","msg":"error processing request","error":"could not find upstream workspace upstream not found","stacktrace":"gitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.errorResponse\n\t/app/pkg/auth/middleware.go:140\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.NewMiddleware.func1.1\n\t/app/pkg/auth/middleware.go:49\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/logging.NewMiddleware.func1.1\n\t/app/pkg/logging/middleware.go:13\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2487\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2947\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1991"}
{"level":"info","ts":1685632116.1038263,"caller":"logging/middleware.go:15","msg":"HTTP request processed","path":"/","ip":"10.164.1.15:58686","status":400,"host":"workspace.sting-ray.za.net","method":"GET"}
2023/06/01 15:08:36 getHostnameFromState state=https://workspace.sting-ray.za.net/
2023/06/01 15:08:36 getHostnameFromState u.Hostname()=workspace.sting-ray.za.net
{"level":"error","ts":1685632116.2182739,"caller":"auth/middleware.go:140","msg":"error processing request","error":"could not find upstream workspace upstream not found","stacktrace":"gitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.errorResponse\n\t/app/pkg/auth/middleware.go:140\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/auth.NewMiddleware.func1.1\n\t/app/pkg/auth/middleware.go:49\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\ngitlab.com/remote-development/gitlab-workspaces-proxy/pkg/logging.NewMiddleware.func1.1\n\t/app/pkg/logging/middleware.go:13\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2487\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2947\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1991"}
{"level":"info","ts":1685632116.218344,"caller":"logging/middleware.go:15","msg":"HTTP request processed","path":"/","ip":"10.164.1.15:58696","status":400,"host":"workspace.sting-ray.za.net","method":"GET"}
W0601 16:27:44.467423 1 reflector.go:347] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0601 16:28:15.573290 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
I0601 16:28:15.573388 1 trace.go:219] Trace[683024728]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169 (01-Jun-2023 16:27:45.572) (total time: 30000ms):
Trace[683024728]: ---"Objects listed" error:Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout 30000ms (16:28:15.573)
Trace[683024728]: [30.000680639s] [30.000680639s] END
E0601 16:28:15.573411 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
W0601 16:28:47.927943 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
I0601 16:28:47.928090 1 trace.go:219] Trace[607811211]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169 (01-Jun-2023 16:28:17.925) (total time: 30002ms):
Trace[607811211]: ---"Objects listed" error:Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout 30002ms (16:28:47.927)
Trace[607811211]: [30.00249972s] [30.00249972s] END
E0601 16:28:47.928113 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
W0601 16:29:22.076208 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
I0601 16:29:22.076288 1 trace.go:219] Trace[1458323237]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169 (01-Jun-2023 16:28:52.069) (total time: 30006ms):
Trace[1458323237]: ---"Objects listed" error:Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout 30006ms (16:29:22.076)
Trace[1458323237]: [30.006874603s] [30.006874603s] END
E0601 16:29:22.076303 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
W0601 16:29:59.881794 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
I0601 16:29:59.881875 1 trace.go:219] Trace[436340495]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169 (01-Jun-2023 16:29:29.877) (total time: 30004ms):
Trace[436340495]: ---"Objects listed" error:Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout 30004ms (16:29:59.881)
Trace[436340495]: [30.004718506s] [30.004718506s] END
E0601 16:29:59.881890 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
W0601 16:30:47.303125 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
I0601 16:30:47.303214 1 trace.go:219] Trace[1225511528]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169 (01-Jun-2023 16:30:17.302) (total time: 30000ms):
Trace[1225511528]: ---"Objects listed" error:Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout 30000ms (16:30:47.303)
Trace[1225511528]: [30.000949372s] [30.000949372s] END
E0601 16:30:47.303235 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
W0601 16:32:04.992789 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
I0601 16:32:04.992895 1 trace.go:219] Trace[629458047]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169 (01-Jun-2023 16:31:34.992) (total time: 30000ms):
Trace[629458047]: ---"Objects listed" error:Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout 30000ms (16:32:04.992)
Trace[629458047]: [30.000767186s] [30.000767186s] END
E0601 16:32:04.992925 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.167.240.1:443/api/v1/services?labelSelector=agent.gitlab.com%2Fid&resourceVersion=175744": dial tcp 10.167.240.1:443: i/o timeout
I0601 16:32:59.251294 1 trace.go:219] Trace[1616138287]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:169 (01-Jun-2023 16:32:43.913) (total time: 15338ms):
Trace[1616138287]: ---"Objects listed" error: 15337ms (16:32:59.251)
Trace[1616138287]: [15.338014344s] [15.338014344s] END
```
Cluster config

Cluster: GKE
Nodes: currently 1, autoscaling enabled to 3

Cluster provisioned by Terraform:

# GKE cluster
resource "google_container_cluster" "primary" {
  name     = "${var.project_id}-gke"
  location = local.region-zone
  
  # We can't create a cluster with no node pool defined, but we want to only use
  # separately managed node pools. So we create the smallest possible default
  # node pool and immediately delete it.
  remove_default_node_pool = true
  initial_node_count       = 1

  network    = google_compute_network.vpc.name
  subnetwork = google_compute_subnetwork.subnet.name
}

# Separately Managed Node Pool
resource "google_container_node_pool" "primary_nodes" {
  name               = google_container_cluster.primary.name
  location           = local.region-zone
  cluster            = google_container_cluster.primary.name
  initial_node_count = var.gke_num_nodes

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]

    labels = {
      env = var.project_id
    }

    # preemptible  = true
    machine_type = "e2-standard-8"
    tags         = ["gke-node", "${var.project_id}-gke"]
    metadata = {
      disable-legacy-endpoints = "true"
    }
  }
  
  autoscaling {
    min_node_count = 1
    max_node_count = 3
  }
}
Other cluster details
❯ kubectl get all -A
NAMESPACE                       NAME                                                                 READY   STATUS    RESTARTS   AGE
gitlab-agent-rdev               pod/rdev-gitlab-agent-6ff8749b5f-6c5mh                               1/1     Running   0          82m
gitlab-workspaces               pod/gitlab-workspaces-proxy-94f8dc8fd-vzrsn                          1/1     Running   0          18h
gl-rd-ns-61728-2083197-ilrb7y   pod/workspace-61728-2083197-ilrb7y-86965748db-xd9r9                  1/1     Running   0          78m
ingress-nginx                   pod/nginx-ingress-ingress-nginx-controller-867dc6b6c5-q44t4          1/1     Running   0          21h
kube-system                     pod/event-exporter-gke-755c4b4d97-4ctjh                              2/2     Running   0          22h
kube-system                     pod/fluentbit-gke-vwr6w                                              2/2     Running   0          22h
kube-system                     pod/gke-metrics-agent-mfsbq                                          2/2     Running   0          22h
kube-system                     pod/konnectivity-agent-65c88cbd8d-lmp95                              1/1     Running   0          22h
kube-system                     pod/konnectivity-agent-autoscaler-7dc78c8c9-txn4s                    1/1     Running   0          22h
kube-system                     pod/kube-dns-5b5dfcd97b-c6m47                                        4/4     Running   0          22h
kube-system                     pod/kube-dns-autoscaler-5f56f8997c-5mpds                             1/1     Running   0          22h
kube-system                     pod/kube-proxy-gke-rhook-068b113c-g-rhook-068b113c-g-15da185b-r80g   1/1     Running   0          22h
kube-system                     pod/l7-default-backend-8cdcff48c-88ws5                               1/1     Running   0          22h
kube-system                     pod/metrics-server-v0.5.2-67864775dc-zrv65                           2/2     Running   0          22h
kube-system                     pod/pdcsi-node-8bsst                                                 2/2     Running   0          22h

NAMESPACE                       NAME                                                       TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)                      AGE
default                         service/kubernetes                                         ClusterIP      10.167.240.1     <none>         443/TCP                      22h
gitlab-workspaces               service/gitlab-workspaces-proxy                            ClusterIP      10.167.246.127   <none>         80/TCP                       18h
gl-rd-ns-61728-2083197-ilrb7y   service/workspace-61728-2083197-ilrb7y                     ClusterIP      10.167.252.144   <none>         3000/TCP,60001/TCP           78m
ingress-nginx                   service/nginx-ingress-ingress-nginx-controller             LoadBalancer   10.167.241.236   <redacted>   80:30434/TCP,443:31123/TCP   21h
ingress-nginx                   service/nginx-ingress-ingress-nginx-controller-admission   ClusterIP      10.167.245.204   <none>         443/TCP                      21h
kube-system                     service/default-http-backend                               NodePort       10.167.246.78    <none>         80:30179/TCP                 22h
kube-system                     service/kube-dns                                           ClusterIP      10.167.240.10    <none>         53/UDP,53/TCP                22h
kube-system                     service/metrics-server                                     ClusterIP      10.167.250.223   <none>         443/TCP                      22h
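
For triage, note that 10.167.240.1 in the service listing above is the ClusterIP of the default/kubernetes service, which is the same address the agent's reflector timed out against in the log excerpt. Below is a minimal sketch for counting those failures in an agent log. The helper name `count_api_timeouts` is hypothetical, and the Deployment name `rdev-gitlab-agent` in the usage comment is an assumption inferred from the pod name, not something confirmed in this report.

```shell
# Count klog error-level lines (prefix "E" followed by MMDD) that contain an i/o timeout.
# Reads log text on stdin, one entry per line, as kubectl emits it.
# count_api_timeouts is a hypothetical helper name, not part of any GitLab tooling.
count_api_timeouts() {
  # grep -c exits non-zero when the count is 0; "|| true" keeps the function usable under "set -e".
  grep -cE '^E[0-9]{4} .*i/o timeout' || true
}

# Example (assumes the Deployment is named rdev-gitlab-agent, inferred from the pod name):
#   kubectl logs -n gitlab-agent-rdev deploy/rdev-gitlab-agent | count_api_timeouts
```

A count that keeps rising while CI jobs through the same agent still succeed would match the behavior described in this report, though it does not by itself identify the cause.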

Output of checks

This bug happens on GitLab.com

Edited by Raimund Hook