Group-level Prometheus only persists alerts for one project

Summary

Spun out of this comment: #26704 (comment 222571810)

When a group-level cluster has a GitLab-managed Prometheus, only the alerts for the project with the most recently updated alert are persisted on the Prometheus server.

Steps to reproduce

  1. Create a group with a group-level cluster

  2. Install Prometheus

  3. Create two projects under the group and deploy both of them

  4. Connect to your Kubernetes cluster and observe /etc/config/alerts on the Prometheus server. For example:

    kubectl -n gitlab-managed-apps exec -it $PROMETHEUS_POD_NAME -c prometheus-server -- watch cat /etc/config/alerts

    replacing $PROMETHEUS_POD_NAME with the name of the currently deployed Prometheus server pod (you can find it with kubectl -n gitlab-managed-apps get pods).

  5. Create an alert on the first project and wait until it is reflected in /etc/config/alerts. Make a note of the values.

  6. Create an alert on the second project and observe that it overwrites the entire contents of /etc/config/alerts.

What is the current bug behavior?

/etc/config/alerts only contains alerts from one of the two projects

What is the expected correct behavior?

/etc/config/alerts should have alerts from both projects

Relevant logs and/or screenshots

Recording of the alerts file changing, on a setup I configured locally, after modifying an alert on the second project as described above. The projects in my case are called the-group/minimal-ruby-app and the-group/minimal-ruby-app-2.

alerts-changing

Results of GitLab environment info

Observed in GDK on recent master

Possible fixes

The code responsible is ultimately in Clusters::Applications::Prometheus#upgrade_command, but the #update_command inherited from ApplicationCore is broken in a similar way: it overwrites /etc/config/alerts with a blank file.
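
One possible direction, sketched below in Ruby, is to regenerate the alerts file from every project attached to the cluster whenever any project's alerts change, instead of serializing only the project that triggered the update. This is a minimal sketch of the idea, not the actual GitLab implementation; projects_with_prometheus_alerts and rules_for are hypothetical helper names used purely for illustration.

    require 'yaml'

    # Sketch only, not the real GitLab code.
    # `cluster.projects_with_prometheus_alerts` and `rules_for(project)` are
    # hypothetical helpers standing in for "all projects attached to the
    # cluster that define Prometheus alerts" and "that project's alert rules".
    def alerts_file_content(cluster)
      groups = cluster.projects_with_prometheus_alerts.map do |project|
        {
          'name'  => "#{project.full_path} alerts", # one rule group per project
          'rules' => rules_for(project)             # keep every project's rules
        }
      end

      # Serialize all groups together so that writing the file for one project
      # can never drop another project's alerts.
      { 'groups' => groups }.to_yaml
    end

Whatever shape the real fix takes, both #upgrade_command and the inherited #update_command would need to send this aggregated content, so that neither path replaces /etc/config/alerts with a subset of the alerts or with a blank file.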

Related issue: Alert persistence needs to be fixed in order for Prometheus chart updates to be feasible (#26704, closed)