Support alerting for custom dashboard (area-charts)

Problem to solve

Users should be able to set alerts on metrics defined in dashboard yml files in the project. This ability is already available for GitLab-defined "common" metrics & custom metrics created in the UI, but not yet for yml-defined metrics. As a first step, we should bring metrics defined for area-charts up to parity with existing metrics. Metrics for other panel types should come in another iteration.
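For reference, a yml-defined area-chart metric might look like the following. This is a minimal sketch: the field names (`panel_groups`, `type`, `metrics`, `id`, `query_range`) follow the custom-dashboard format but are illustrative here, not authoritative, and the metric identifier is what the persistence step would key on.

```ruby
require 'yaml'

# Hypothetical .gitlab/dashboards/example.yml defining one area-chart
# panel with a single metric. Field names are assumptions for
# illustration, not the definitive schema.
dashboard_yml = <<~YAML
  dashboard: 'Example dashboard'
  panel_groups:
    - group: 'System metrics'
      panels:
        - type: area-chart
          title: 'Throughput'
          y_label: 'Requests / Sec'
          metrics:
            - id: system_metrics_throughput
              query_range: 'rate(http_requests_total[5m])'
              unit: req / sec
YAML

dashboard = YAML.safe_load(dashboard_yml)

# Collect the metric ids defined for area-chart panels only, since this
# iteration is scoped to area-charts.
area_chart_metric_ids =
  dashboard['panel_groups']
    .flat_map { |group| group['panels'] }
    .select { |panel| panel['type'] == 'area-chart' }
    .flat_map { |panel| panel['metrics'] }
    .map { |metric| metric['id'] }

puts area_chart_metric_ids.inspect  # => ["system_metrics_throughput"]
```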

  • Open questions:
    • When do we want to persist metrics? Pipeline?
      • Should this step include validation & fail the build for invalid dashboards?
    • Do we want to clean up project metrics which are no longer present in a dashboard, or do they live forever?
    • Do we want to perform reconciliation between custom metrics created in the UI & dashboard yml-defined metrics?

Intended users

Further details

Proposal

Technical Implementation Proposal:

  • Refactor CommonMetricsImporter to support project-defined metrics; include a follow-up step that removes stale project-defined metrics (probably distinguishing them from custom metrics by the presence of the identifier from the yml)
  • Call the updated importer from the pipeline? (This is the part I'm fuzzier on, and it has potential for scope creep)
  • Update Metrics::Dashboard::Processor stages to account for project-defined dashboard metrics
  • Test alerting (Should work out of the box, but we'll need to test)

First iteration:

  • Should the persistence step include validation & fail the build for invalid dashboards? No.
  • Do we want to clean up project metrics which are no longer present in a dashboard, or do they live forever? They do not live forever; stale metrics are removed.
  • Do we want to perform reconciliation between custom metrics created in the UI & dashboard yml-defined metrics? No. If a metric has been defined in both places, the user can perform whatever cleanup they want.
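Skipping reconciliation is workable because the two kinds of metric stay distinguishable. The sketch below shows the assumed distinction (same assumption as in the proposal: yml-defined metrics carry the identifier from the dashboard file, UI-created custom metrics have none), so duplicates simply coexist as two records.

```ruby
# Illustrative records only; attribute names are assumptions.
metrics = [
  { identifier: 'system_metrics_throughput', query: 'rate(a[5m])' }, # yml-defined
  { identifier: nil,                         query: 'rate(a[5m])' }  # UI-created
]

# No reconciliation: the same query defined in both places yields two
# independent records, split cleanly by identifier presence. Cleanup of
# duplicates is left to the user.
yml_defined, ui_created = metrics.partition { |m| m[:identifier] }

puts yml_defined.length  # => 1
puts ui_created.length   # => 1
```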

Permissions and Security

Documentation

Testing

  • integration tests as usual
  • end-to-end test
  • consider triggering package-and-qa on the MR to ensure existing end-to-end tests do not break (an alert test is defined, but only for common metrics)

What does success look like, and how can we measure that?

Links / references
