Improve our strategy around pods (in particular Prometheus) failing to start

One approach to this problem might be:

  • use a startupProbe on the Prometheus pods once we upgrade the Prometheus Operator
  • set up alerts on pods that fail their startupProbe
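
The two steps above could be sketched roughly as follows. This is only an illustration under assumptions, not the configuration we actually run: the helper names, the 15-minute startup budget, the "web" port name, and the alert expression are all hypothetical. The one firm fact it relies on is that Kubernetes gives a container roughly periodSeconds × failureThreshold to pass its startupProbe before restarting it. The alert expression assumes the kubelet's prober_probe_total metric is scraped in our setup, which needs checking.

```python
import math


def startup_probe(max_startup_seconds, period_seconds=15,
                  path="/-/ready", port="web"):
    """Build a startupProbe dict sized for a given startup budget.

    Assumptions (not taken from the issue): the probe targets
    Prometheus' readiness endpoint on a container port named "web".
    """
    # Round up so the budget is never shorter than requested.
    failure_threshold = math.ceil(max_startup_seconds / period_seconds)
    return {
        "httpGet": {"path": path, "port": port},
        "periodSeconds": period_seconds,
        "failureThreshold": failure_threshold,
    }


def startup_budget_seconds(probe):
    """Approximate time the container has to start before being restarted."""
    return probe["periodSeconds"] * probe["failureThreshold"]


# Hypothetical alert expression; assumes the kubelet metric
# prober_probe_total (with probe_type and result labels) is available.
STARTUP_FAILURE_ALERT_EXPR = (
    'increase(prober_probe_total{probe_type="Startup",result="failed"}[10m]) > 0'
)

# Example: allow Prometheus up to 15 minutes to start (e.g. long WAL replay).
probe = startup_probe(max_startup_seconds=900)
```

The budget would need to be tuned to how long Prometheus really takes to start in our clusters; a value that is too small makes the kubelet restart the pod mid-startup.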
Edited by Michal Wasilewski