WIP: Try out anomaly detection experiment

Adrien Kohlbecker requested to merge ak/anomaly-detection-experiment into master

What does this MR do?

This is a very early-stage experiment reproducing what is outlined in the Anomaly Detection Using Prometheus blog post.

Behind the :anomaly_detection_experiment feature flag on a project, it configures an alert that fires if the number of requests going to NGINX (per zone) deviates by more than 2 standard deviations from last week's metrics.

This is discussed at length here, specifically:

The smallest change we want to ship is to take a relevant metric (Kenny suggested nginx http requests), and set up an alert to fire when it is outside N standard deviations of the weekly moving average.

While we originally discussed not being able to disable the alert (or at least being unsure if that was possible), I was able to put this behind a feature flag.
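
To make the alert condition concrete, here is a minimal Python sketch of the "outside N standard deviations" check. It is not the code in this MR, which configures a Prometheus alert; the function name `is_anomalous`, its parameters, the use of a population standard deviation, and the flat list of samples are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def is_anomalous(current, last_week_samples, n_sigma=2.0):
    """Return True if `current` lies more than `n_sigma` standard
    deviations away from the mean of `last_week_samples`.

    Hypothetical helper: the names and the choice of population
    standard deviation are assumptions, not taken from the MR.
    """
    if len(last_week_samples) < 2:
        # Not enough history to estimate a spread; never flag.
        return False
    avg = mean(last_week_samples)
    sd = pstdev(last_week_samples)
    if sd == 0:
        # A perfectly flat history: any change counts as a deviation.
        return current != avg
    return abs(current - avg) / sd > n_sigma


# Illustrative request rates, not real data.
samples = [100, 110, 90, 105, 95]
print(is_anomalous(120, samples))
print(is_anomalous(110, samples))
```

The threshold is a parameter here because the discussion leaves N open; the MR itself uses 2.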

Caveats:

  • This is hard to test locally because you need at least a week's worth of metrics. Any suggestions on how to approach testing would be welcome
  • We don't know whether the customer's metrics follow a normal distribution. If they don't, this approach is probably flawed
  • We also don't know whether 2 standard deviations is a reasonable limit for their data
  • I'm currently unsure when the PrometheusUpdateService runs. So far I've found that it runs at least when an alert is added or removed through the UI
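
One rough way to probe the two distribution caveats, given a week of samples, is to measure how many of them fall within N standard deviations of their mean. For normally distributed data about 95% fall within 2 standard deviations, so a very different fraction hints that the assumption or the limit may not fit. The sketch below is only a crude sanity check, not a formal normality test; `fraction_within` and its parameters are hypothetical names.

```python
from statistics import mean, pstdev


def fraction_within(samples, n_sigma=2.0):
    """Return the fraction of `samples` that lie within `n_sigma`
    population standard deviations of their mean.

    Hypothetical helper for illustration only.
    """
    avg = mean(samples)
    sd = pstdev(samples)
    if sd == 0:
        # All samples are identical, so all are trivially inside.
        return 1.0
    inside = sum(1 for x in samples if abs(x - avg) <= n_sigma * sd)
    return inside / len(samples)


# Illustrative data with one large spike, not real metrics.
print(fraction_within([10] * 9 + [100]))
```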

Does this MR meet the acceptance criteria?

Conformity

Performance and testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by 🤖 GitLab Bot 🤖