
Out-of-the-box NGINX alerts for auto-monitoring in Auto DevOps

Problem to solve

We can further reduce the configuration users need before they see value from the Monitor stage by adding out-of-the-box alerts for NGINX metrics that work as soon as auto-monitoring is set up for Auto DevOps. This will:

  • Reduce time to value for users
  • Demonstrate alerting and encourage users to explore alerting

The Monitor stage is quickly maturing and it's desirable to lower the barrier to entry for new users. Adding alerts to auto-monitoring in Auto DevOps removes configuration work and shows users how alerting works.

Intended Users

Sasha the Software Developer
Devon the DevOps Engineer
Sidney the Systems Administrator

Further Details

This work supports the Incident Management Vision

Proposal

For newly created projects, as soon as a user installs Prometheus on their cluster, two default metrics would have the following alerts applied:

  1. On the Throughput metric, add an alert to auto-monitoring in Auto DevOps when 5xx status codes ≥ 0.1% for 10 minutes
  2. On the HTTP error rate metric, add an alert to auto-monitoring in Auto DevOps when 5xx errors ≥ 0.1% for 10 minutes
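The threshold logic in both alerts (a 5xx share of at least 0.1%, sustained for 10 minutes) can be sketched as follows. This is only an illustration of the intended behavior, not the actual implementation: the `Sample` type, the `should_fire` function, and the assumption that each sample carries a total request count and a 5xx count are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

THRESHOLD = 0.001        # 0.1% expressed as a fraction
HOLD_SECONDS = 10 * 60   # the condition must hold for 10 minutes


@dataclass(frozen=True)
class Sample:
    """Hypothetical measurement: one scrape of request counts."""
    timestamp: float       # seconds since some epoch
    total_requests: float  # all requests observed in the interval
    errors_5xx: float      # requests that returned a 5xx status


def error_ratio(sample: Sample) -> float:
    """Share of requests that returned 5xx; 0.0 when there was no traffic."""
    if sample.total_requests <= 0:
        return 0.0
    return sample.errors_5xx / sample.total_requests


def should_fire(
    samples: Iterable[Sample],
    threshold: float = THRESHOLD,
    hold_seconds: float = HOLD_SECONDS,
) -> bool:
    """Return True if the ratio stays >= threshold for at least hold_seconds."""
    breach_start: Optional[float] = None
    for sample in sorted(samples, key=lambda s: s.timestamp):
        if error_ratio(sample) >= threshold:
            if breach_start is None:
                breach_start = sample.timestamp
            if sample.timestamp - breach_start >= hold_seconds:
                return True
        else:
            # Any healthy sample resets the 10-minute window.
            breach_start = None
    return False
```

Under these assumptions, a brief spike that recovers within the 10-minute window does not fire the alert, which keeps short-lived errors from generating noise.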

Additional details

  • These alerts should NOT automatically trigger the creation of incidents.
  • These behaviors should not apply to existing projects or existing Prometheus installs, only to newly created ones. We don't want to inadvertently impact existing alert settings.

Permissions and Security

Documentation

Documentation required. Please update this section of the docs: https://docs.gitlab.com/ee/topics/autodevops/stages.html#auto-monitoring

Testing

What does success look like, and how can we measure that?

What is the type of buyer?

Links / references

Edited by Sarah Waldner