2021-10-27: Upgrade prometheus helm chart gprd

Production Change

component	before	after
`prometheus-operator`	v0.42.1	master@sha256:bb79240165868c7d73d3db2b45bd065bf2b3050729aa4809f6de79cace232feb
`kube-state-metrics`	v1.9.7	v2.2.0
`prometheus-community/kube-prometheus-stack`	10.3.5	19.1.0

Upgrades to environments were done in the following change management issue:

Estimated Time to Complete (mins) - 1

Estimated Time to Complete (mins) - 50

Estimated Time to Complete (mins) - 10

Verify that all pods have restarted: kubectl -n monitoring get po apart from thanos, memcached, gitaly-exporter
Verify that the new operator version is running: kubectl -n monitoring get po gitlab-monitoring-promethe-operator-7dc8f7b879-4dk88 -o json | jq .spec.containers[0].image expected value is "ghcr.io/prometheus-operator/prometheus-operator:master@sha256:bb79240165868c7d73d3db2b45bd065bf2b3050729aa4809f6de79cace232feb"
Take a look at the operator logs and check if there are any error level logs: kubectl -n monitoring logs gitlab-monitoring-promethe-operator-7dc8f7b879-4dk88 --since=5m. If there is a large amount of logs you can filter for error level kubectl -n monitoring logs gitlab-monitoring-promethe-operator-7dc8f7b879-4dk88 --since=5m | grep 'err'
Verify that service discovery is working curl -s -L $(kubectl -n monitoring get svc prometheus-headless -o json | jq '.metadata.annotations["external-dns.alpha.kubernetes.io/hostname"]' -r):9090/metrics | grep 'scrape_pool_targets'
Check the ingress is working as expected: https://console.cloud.google.com/kubernetes/ingresses?project=gitlab-production&pageState=(%22savedViews%22:(%22i%22:%226c0e9c818063462585995d31405639f5%22,%22c%22:%5B%5D,%22n%22:%5B%5D),%22ingress_list_table%22:(%22f%22:%22%255B%255D%22)) if any backends are reporting unhealthy investigate

Estimated Time to Complete (mins) - 5

~~Does this change introduce new compute instances?~~
~~Does this change re-size any existing compute instances?~~
~~Does this change introduce any additional usage of tooling like Elastic Search, CDNs, Cloudflare, etc?~~

Edited Oct 27, 2021 by Steve Xuereb