Follow Up from Monitoring Future Vision Workflow Discussion

TL;DR - We over-rotated on future vision content. Use it to draft the long-term direction, but stay focused on what's next. It is more important to communicate a clear what's next than a clear future vision. Transition between the two more quickly. It is easy to keep adding depth to the future vision when you should be focusing on what's next.

Tasks

Follow-ups from the conversation with Sid on the Monitoring workflow:

  • Remove workflow from Monitor Future Vision - @kencjohnston - www-gitlab-com!35312 (merged)
  • Add a What's Next section for the Monitor vision, add a Triage Minimal workflow diagram, and link to a Triage Minimal Epic - @dhershkovitch - www-gitlab-com!35320 (merged)
  • Start research on minimal workflow as defined in Epic - @dhershkovitch

Workflow

  • Name transitions
  • Consider - Alert->Metric->Log->Trace
  • Consider Auto-defined alert for errors (500 rate) and default incident templates
graph TB;
A[Alerts] -->|Embedded Metric Chart in Incident|B
B[Metrics] -->|Timespan Log Drilldown|C
C[Logs] -->|TraceID Search|D[Traces]

Example - 500 errors:

  • Create an alert that fires at a 5% rate of 500 errors from NGINX
  • Use the auto-provided incident template, which displays the 500 error rate metric (the graph will include relevant deploys)
  • Provide an immediate link to Logs to view/search (targeted to the relevant time span, filtered by container, pod, or service).
  • Logs to tracing is difficult - not needed for minimal? Might be easier with traces in ES? Logs would need to carry the trace ID. We could then filter by trace ID.
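
To make the 500 errors example concrete, here is a rough Python sketch of the three hops: check the error rate against the 5% threshold, pull the logs around the time span, then narrow to a trace ID. Everything named here (`LogLine`, `error_rate`, `should_alert`, `logs_around`, `filter_by_trace`, the 5 minute default padding) is hypothetical and only for illustration; it is not how GitLab, NGINX, or ES implement any of this. It also assumes any 5xx status counts as a "500 error" and that each log line already carries a trace ID.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# 5% threshold taken from the example above.
ERROR_RATE_THRESHOLD = 0.05


@dataclass
class LogLine:
    """Hypothetical parsed access-log line (not a real NGINX or GitLab type)."""
    timestamp: datetime
    service: str
    status: int
    trace_id: Optional[str] = None  # assumption: logs carry the trace ID


def error_rate(lines: List[LogLine]) -> float:
    """Fraction of requests that returned a 5xx status (0.0 for no traffic)."""
    if not lines:
        return 0.0
    errors = sum(1 for line in lines if 500 <= line.status <= 599)
    return errors / len(lines)


def should_alert(lines: List[LogLine],
                 threshold: float = ERROR_RATE_THRESHOLD) -> bool:
    """Alert -> Metric: fire once the error rate reaches the threshold."""
    return error_rate(lines) >= threshold


def logs_around(lines: List[LogLine],
                start: datetime,
                end: datetime,
                padding: timedelta = timedelta(minutes=5),
                service: Optional[str] = None) -> List[LogLine]:
    """Metric -> Logs: keep lines within the padded time span, optionally
    filtered to one service (a container or pod filter would look the same)."""
    lower, upper = start - padding, end + padding
    return [
        line for line in lines
        if lower <= line.timestamp <= upper
        and (service is None or line.service == service)
    ]


def filter_by_trace(lines: List[LogLine], trace_id: str) -> List[LogLine]:
    """Logs -> Traces: only works if the log lines carry the trace ID."""
    return [line for line in lines if line.trace_id == trace_id]
```

The Logs -> Traces step is the only one that depends on data the logs may not have today, which lines up with the note above that it may not be needed for the minimal workflow.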
Edited Nov 26, 2019 by Dov Hershkovitch