We are now able to deploy Prometheus servers to connected Kubernetes clusters with a single button (https://gitlab.com/gitlab-org/gitlab-ce/issues/41053). With GitLab's multi-cluster support, this can mean either a single or multiple Prometheus servers for a given project.
We now need to take this further to make the experience seamless:
- Automatically enable the Prometheus integration, configured for each deployed Prometheus instance.
- Add logic to determine which Prometheus server to use for a given environment, if multiple clusters are connected.
- Use the Kubernetes API to securely query the Prometheus servers, without requiring them to be reachable from outside the cluster.
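For the last point, one way to sketch the in-cluster query path is via the API server's service proxy: GitLab talks to the Kubernetes API server, which forwards the request to the Prometheus service inside the cluster. A minimal sketch (the service name, namespace, and port below are illustrative assumptions, not decided values):

```python
def prometheus_proxy_url(api_server, namespace, service,
                         port=80, query_path="api/v1/query"):
    """Build the Kubernetes service-proxy URL for an in-cluster Prometheus.

    The API server proxies the request to the service, so Prometheus
    never needs to be exposed outside the cluster.
    """
    return (
        f"{api_server}/api/v1/namespaces/{namespace}"
        f"/services/{service}:{port}/proxy/{query_path}"
    )

# Example (hypothetical cluster and service names):
url = prometheus_proxy_url("https://k8s.example.com", "gitlab",
                           "prometheus-prometheus-server")
```

GitLab would authenticate this request with the cluster credentials it already stores for the Kubernetes integration.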
The general user flow will be:
1. User connects one or more Kubernetes clusters.
1. User deploys Helm Tiller, then Prometheus.
1. At this point, the Prometheus integration is automatically enabled. Manual configuration of the Prometheus URL is disabled; the settings page instead indicates that it is being managed automatically.
1. Monitoring UI elements now appear, and function as expected for environments whose cluster has a Prometheus server deployed.
1. If the cluster for an environment does not have Prometheus deployed, the user will see the empty state.
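The "which Prometheus server for which environment" logic could be as simple as matching the environment against each cluster's scope and checking whether Prometheus is installed there. A minimal sketch, assuming clusters are represented as dicts with a `scope` field (exact name or `*` wildcard) and a `prometheus_installed` flag; the field names are assumptions for illustration:

```python
def prometheus_for_environment(environment, clusters):
    """Pick the cluster whose Prometheus serves this environment.

    Returns the first cluster whose scope matches the environment,
    but only if Prometheus is actually deployed there; otherwise
    None, which maps to the empty state in the UI.
    """
    for cluster in clusters:
        scope = cluster["scope"]
        if scope == "*" or scope == environment:
            return cluster if cluster.get("prometheus_installed") else None
    return None
```

A real implementation would also need to handle wildcard scopes like `review/*` and precedence between overlapping scopes; this sketch only shows the basic shape.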
cc @pedroms for an upcoming 9.2/9.3 item. I would think this could be as simple as a single button to launch it and auto-configure.
There are a few places that we could perhaps put this:
- On the empty state: if k8s is already configured, we can change the button to auto-deploy Prometheus and begin monitoring the apps.
- In the Prometheus service configuration.
- In the Kubernetes configuration, once it has been successfully configured. I particularly like this one, as we can instantly start collecting metrics from k8s itself, so data appears right away.
@joshlambert I like this idea. Just to be clear, this option would only appear if the project has k8s service enabled, correct?
In the Prometheus service page, if the user has triggered this “Prometheus server auto-deploy and configuration”, does it make sense to offer additional controls (e.g. stopping, restarting)? My question is: since we are doing this automation for the user, what kind of directions should we give them to manage the new Prometheus server? If any…
@pedroms Good questions! Yes, this would require the k8s integration to be enabled. I would think we would launch the Prometheus container in the project's namespace; @bjk-gitlab, your thoughts there?
Once launched, we should probably have a button to stop and clean up the Prometheus deployment, replica set, etc. Will add to the description!
Yes, launching a Prometheus pod instance within the project namespace is a good way to do it.
The difficulty is how we handle persistence.
I would suggest we use a StatefulSet controller. This might be more complicated, because we will need to integrate with the various Kubernetes storage schemes, such as GCE persistent disks or AWS EBS volumes.
It should be easy enough to detect which type of storage is available to the Kubernetes cluster and auto-provision a volume.
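The auto-provisioning piece mostly falls out of Kubernetes itself: if the StatefulSet's volume claim template leaves `storageClassName` unset, the cluster's default storage class picks the right provisioner (GCE PD, AWS EBS, etc.). A minimal sketch of building that claim template as a manifest fragment; the claim name and default size are assumptions:

```python
def prometheus_volume_claim(size="8Gi", storage_class=None):
    """Build a volumeClaimTemplates entry for a Prometheus StatefulSet.

    Omitting storageClassName lets the cluster's default storage class
    auto-provision the volume, which is what makes this portable across
    GCE, AWS, and other providers.
    """
    claim = {
        "metadata": {"name": "prometheus-data"},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }
    if storage_class is not None:
        claim["spec"]["storageClassName"] = storage_class
    return claim
```

An explicit storage class would only be needed for clusters without a default, which we could surface as an advanced option.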
@joshlambert according to the scope in the description, here are the designs. Question:
do we have any idea of how long the user has to wait for the Prometheus server to be installed on Kubernetes? I'm asking to assess how the loading state should be designed.
Monitoring empty state
Kubernetes service
Prometheus service
For this page, I had to separate and disable the auto and manual configurations so that they wouldn't conflict with each other if the user decides to switch from one to the other.
@pedroms I would think a minute or two maximum to spin up the container, unless there is some issue (e.g. downloading the container image itself, or a resourcing issue with the K8s cluster).
@joshlambert ok, thanks for the insight. I've updated the description with the latest designs and flow, adding an installing and uninstalling state. When pressing “Install Prometheus on Kubernetes” on the get started state, the user should be led to the installing state on the Service page. Let me know if you have any concerns.
Thanks @pedroms, looks good! One quick bit of feedback is on the automatically configured state, we should probably just show the namespace and pod name. So for example "gitlab\prometheus".
@joshlambert so instead of “…running on Kubernetes at http://localhost:9000” just show “…running on Kubernetes at gitlab\prometheus”? Is the reference a link to something?
@sarrahvesselov @pedroms, as @markpundsack has noted, we are looking at a broader Kubernetes configuration method. This would include options to create a new cluster or link an existing one, and then provision other common components like a CI runner. Prometheus would be part of that as well.
Based on this new concept, can we revisit the design for this feature? While the broader version will probably not land until a later release, it would be worthwhile to still deliver this in 10.0 to lay the foundation and prove out some of the components. Moving this to the k8s page also aligns us better with that direction.
Does that sound good? I believe you are also copied on the larger issue, for input.
Noting that we should use the helm chart unless there is a strong reason not to. From a glance, it looks like it should do what we need.
The one concern is: if we deliver this as part of 10.0, ahead of the rest of the items (https://gitlab.com/gitlab-org/gitlab-ce/issues/35956), how much of the groundwork (installing Helm Tiller, etc.) would we need to pick up ourselves?
@sarrahvesselov @pedroms, pinging back on this issue to see if we can get some mockups for how this would look if we presented an option to deploy Prometheus on the K8s integration page, once configured.
This is the desired end state: https://gitlab.com/gitlab-org/gitlab-ce/issues/27888, but Prometheus will likely be the leading issue to lay some of the foundation for deploying services into a cluster. Since this is all about cluster management, it makes sense to start with this in the Kubernetes area.
I am thinking it can be as simple as:
Button to deploy Prometheus once you have a cluster configured and no other Prometheus server configured. It can be greyed out if the Prometheus integration is already enabled with another server.
My concern here, though, is that when you save an integration, I think you are returned to the integrations list? That means you wouldn't see this unless you came back, which is a problem.
Once deployed, the MVP can just display a temporary banner indicating that deployment is in progress. Later on, we can provide more insight into status.
I removed the UX Ready label from this for now. Do you have time to address @joshlambert's concerns, @pedroms? If not, we can see if @cperessini or another UXer has time to take a peek.