chore: upgrade prometheus stack v19.1.0

Steve Xuereb requested to merge steveazz/update-prom-stack-ops-ci-gstg into master

Background

Upgrade the helm chart version from 10.3.5 to 19.1.0 so that we use the latest stable version of the chart and can define a startupProbe, which is needed for a corrective action.

Solution

This is the second attempt to upgrade the helm chart version and the Prometheus operator. The first attempt can be found in !494 (merged) with a detailed description of why we had to. A fix has been merged upstream in https://github.com/prometheus-operator/prometheus-operator/pull/4309 but has not yet been released.

Use the latest stable version of the helm chart and pin a specific version of the operator using the GitHub Container Registry image, since that registry has permanent storage.
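As a rough illustration of the pinning described above, the sketch below only prints a plain helm command; it does not run it. It is not the configuration used in this merge request. The chart name (prometheus-community/kube-prometheus-stack), the release name, the namespace, the image path and the `prometheusOperator.image.*` value keys are all assumptions, and the operator tag is left as a placeholder because the description does not state it.

```shell
#!/bin/sh
set -eu

CHART_VERSION="19.1.0"                 # chart version named in this merge request
RELEASE="prometheus-stack"             # hypothetical release name
NAMESPACE="monitoring"                 # hypothetical namespace
# Assumed GitHub Container Registry path for the operator image.
OPERATOR_IMAGE="ghcr.io/prometheus-operator/prometheus-operator"
# Placeholder: the exact operator tag is not given in the description.
OPERATOR_TAG="${OPERATOR_TAG:-REPLACE_ME}"

# Print the helm command instead of running it, so it can be reviewed first.
helm_upgrade_cmd() {
  printf 'helm upgrade --install %s prometheus-community/kube-prometheus-stack' "$RELEASE"
  printf ' --namespace %s --version %s' "$NAMESPACE" "$CHART_VERSION"
  printf ' --set prometheusOperator.image.repository=%s' "$OPERATOR_IMAGE"
  printf ' --set prometheusOperator.image.tag=%s\n' "$OPERATOR_TAG"
}

helm_upgrade_cmd
```

Pinning both the chart version and the operator image tag keeps the deployed operator independent of whatever default image the chart release ships with.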

Validation that this specific commit fixes our validation issue can be seen in this commit.

Upgrade

As part of the upgrade process we have to update the CRDs. This is not yet automated, so it is done manually in gitlab-com/gl-infra/production#5731 (closed).

0a38647379a5e93f639bf8e634deabcc32e01fb6 is the commit that fixes the alertmanager configuration validation.

kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/0a38647379a5e93f639bf8e634deabcc32e01fb6/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagerconfigs.yaml
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/0a38647379a5e93f639bf8e634deabcc32e01fb6/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagers.yaml
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/0a38647379a5e93f639bf8e634deabcc32e01fb6/example/prometheus-operator-crd/monitoring.coreos.com_podmonitors.yaml
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/0a38647379a5e93f639bf8e634deabcc32e01fb6/example/prometheus-operator-crd/monitoring.coreos.com_probes.yaml
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/0a38647379a5e93f639bf8e634deabcc32e01fb6/example/prometheus-operator-crd/monitoring.coreos.com_prometheuses.yaml
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/0a38647379a5e93f639bf8e634deabcc32e01fb6/example/prometheus-operator-crd/monitoring.coreos.com_prometheusrules.yaml
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/0a38647379a5e93f639bf8e634deabcc32e01fb6/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/0a38647379a5e93f639bf8e634deabcc32e01fb6/example/prometheus-operator-crd/monitoring.coreos.com_thanosrulers.yaml
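The eight commands above differ only in the CRD name, so they could be generated from a single list. The following is a minimal sketch of that idea, not the procedure used in the production issue. The `DRY_RUN` variable and the function names are hypothetical, and by default the script only prints the commands.

```shell
#!/bin/sh
set -eu

# Commit that fixes the alertmanager configuration validation (from this description).
COMMIT="0a38647379a5e93f639bf8e634deabcc32e01fb6"
BASE_URL="https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/${COMMIT}/example/prometheus-operator-crd"

# CRD names, in the same order as the manual commands.
CRDS="alertmanagerconfigs alertmanagers podmonitors probes prometheuses prometheusrules servicemonitors thanosrulers"

# Print one manifest URL per CRD.
crd_urls() {
  for crd in $CRDS; do
    printf '%s/monitoring.coreos.com_%s.yaml\n' "$BASE_URL" "$crd"
  done
}

# With DRY_RUN=1 (the default) only print the commands; set DRY_RUN=0 to apply.
apply_crds() {
  for url in $(crd_urls); do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "kubectl apply -f $url"
    else
      kubectl apply -f "$url"
    fi
  done
}

apply_crds
```

Keeping the commit hash in one variable means every manifest comes from the same revision.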

Reference: https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/13973
