Deploying Prometheus within the Auto DevOps cluster fails
Summary
Deploying Prometheus into the Auto DevOps cluster created for the minimal-ruby-app
example on Google Cloud fails.
Steps to reproduce
I followed this documentation:
- https://gitlab.com/help/topics/autodevops/index.md
- https://gitlab.com/help/topics/autodevops/quick_start_guide.md
- https://docs.gitlab.com/ce/user/project/integrations/prometheus.html
So what I did:
- Created a fork of https://gitlab.com/auto-devops-examples/minimal-ruby-app to https://gitlab.com/mark-veenstra/minimal-ruby-app
- Added my own cluster, see: https://gitlab.com/mark-veenstra/minimal-ruby-app/clusters/280
- Exposed it to the world
- Set up Auto DevOps
- Installed Prometheus into the Kubernetes cluster using the sample YML file provided in that documentation, applied with the `kubectl` command (see the sketch after this list)
- I can't get further than the point of adding the firewall rules; I guess this is because Prometheus is not starting correctly
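
Roughly, the install steps were as follows (a sketch; the file name `prometheus.yml` is just my local name for the sample manifest, not something the docs prescribe):

```sh
# Create a dedicated namespace for Prometheus.
kubectl create namespace prometheus

# Apply the sample YML file from the Prometheus integration documentation
# (saved locally as prometheus.yml; adjust the path to wherever you saved it).
kubectl apply -f prometheus.yml -n prometheus

# Watch the pod start; this is where it ends up in CrashLoopBackOff.
kubectl get pods -n prometheus -w
```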
Example Project
https://gitlab.com/mark-veenstra/minimal-ruby-app
What is the current bug behavior?
I see the following output from `kubectl describe pods -n prometheus`:
```
mva@mva-laptop:~$ kubectl describe pods -n prometheus
Name:           prometheus-3696020256-hm3sr
Namespace:      prometheus
Node:           gke-cluster-auto-devops--default-pool-47e3d392-fkvv/10.132.0.3
Start Time:     Wed, 15 Nov 2017 08:39:26 +0100
Labels:         app=prometheus
                pod-template-hash=3696020256
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"prometheus","name":"prometheus-3696020256","uid":"1b55d2de-c9d8-11e7-9673-42010a8...
Status:         Running
IP:             10.20.1.9
Created By:     ReplicaSet/prometheus-3696020256
Controlled By:  ReplicaSet/prometheus-3696020256
Containers:
  prometheus:
    Container ID:   docker://ea5a68df0c481b661936791a5425e80f7229cc1ddce3bf2d15f22a7f79407451
    Image:          prom/prometheus:latest
    Image ID:       docker-pullable://prom/prometheus@sha256:a9fd401b348a41f00b8110f8b5e90c4e61caaf57ac0013ce6ed487bbb25a349d
    Port:           9090/TCP
    Args:
      -config.file=/prometheus-data/prometheus.yml
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 16 Nov 2017 09:51:38 +0100
      Finished:     Thu, 16 Nov 2017 09:51:38 +0100
    Ready:          False
    Restart Count:  300
    Environment:    <none>
    Mounts:
      /prometheus-data from data-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-5s54h (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  data-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      prometheus
    Optional:  false
  default-token-5s54h:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-5s54h
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.alpha.kubernetes.io/notReady:NoExecute for 300s
                 node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason      Age                  From                                                          Message
  ----     ------      ----                 ----                                                          -------
  Normal   Pulling     1m (x301 over 1d)    kubelet, gke-cluster-auto-devops--default-pool-47e3d392-fkvv  pulling image "prom/prometheus:latest"
  Normal   Pulled      1m (x301 over 1d)    kubelet, gke-cluster-auto-devops--default-pool-47e3d392-fkvv  Successfully pulled image "prom/prometheus:latest"
  Normal   Created     1m (x301 over 1d)    kubelet, gke-cluster-auto-devops--default-pool-47e3d392-fkvv  Created container
  Normal   Started     1m (x301 over 1d)    kubelet, gke-cluster-auto-devops--default-pool-47e3d392-fkvv  Started container
  Warning  BackOff     14s (x6810 over 1d)  kubelet, gke-cluster-auto-devops--default-pool-47e3d392-fkvv  Back-off restarting failed container
  Warning  FailedSync  14s (x6810 over 1d)  kubelet, gke-cluster-auto-devops--default-pool-47e3d392-fkvv  Error syncing pod
```
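
To see why the container exits immediately, the logs of the last crashed container instance can be pulled (a sketch, using the pod name from the output above):

```sh
# Show the logs of the previous (crashed) container run.
kubectl logs prometheus-3696020256-hm3sr -n prometheus --previous
```

One thing that stands out above: the container is `prom/prometheus:latest`, but the argument uses a single dash (`-config.file=...`). Prometheus 2.0, which `latest` now resolves to, only accepts double-dash flags (`--config.file=...`), so this mismatch may be what makes the container exit right after starting.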
What is the expected correct behavior?
That Prometheus would start up with the provided sample YML file.
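
For comparison, a sketch of what a healthy deployment should look like (the values below are illustrative, not actual output):

```sh
kubectl get pods -n prometheus
# NAME                          READY     STATUS    RESTARTS   AGE
# prometheus-xxxxxxxxxx-xxxxx   1/1       Running   0          1m
```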
Relevant logs and/or screenshots
See the `kubectl describe` output above.
Output of checks
This bug happens on GitLab.com