Set up and configure alerts for Duo Code Review
do you know if we have alerts set up already for Duo Code Review?
I don't think so
Summary
This issue follows up on #552385 (closed).
Problem
Currently, Duo Code Review lacks alerting configuration, which limits our visibility into feature health and our ability to proactively identify and resolve issues. This observability gap could result in:
- Delayed incident detection and response
- Difficulty troubleshooting feature degradation
- Potential impact on user experience before issues are discovered
Proposal/Ideas
- Implement comprehensive alerting for Duo Code Review to improve observability and incident response capabilities.
- Given the observability focus of this work, we should collaborate with groupobservability to ensure proper instrumentation and integration with existing monitoring infrastructure.
Success Criteria
- Comprehensive alerting implemented for critical Duo Code Review metrics (including, but not limited to, latency, error rates, and throughput)
- Alerts integrated with existing incident management workflows (i.e. the AlertManager dashboard and the #g_code_creation_alerts Slack channel)
- Enhanced visibility into feature performance and error patterns
- Improved incident response capabilities through better diagnostic information
- Runbooks updated with:
  - Links to relevant alerts, dashboards, and logs, plus clear troubleshooting steps for future incidents
  - Escalation paths for different severity levels
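
To make the metric ideas above more concrete, here is a minimal Python sketch of how alert conditions for error rate, latency, and throughput could be evaluated over one time window. It is only an illustration of the logic, not the actual alerting setup: the `WindowStats` shape, the alert names, and every threshold value are hypothetical and would need to be replaced with values derived from real baselines.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Aggregated request stats for one evaluation window (hypothetical shape)."""
    total_requests: int
    failed_requests: int
    p95_latency_seconds: float


# Hypothetical thresholds, for illustration only. Real values should come
# from observed baselines and agreed service-level objectives.
MAX_ERROR_RATE = 0.05
MAX_P95_LATENCY_SECONDS = 30.0
MIN_REQUESTS_PER_WINDOW = 10


def evaluate_alerts(stats: WindowStats) -> list[str]:
    """Return the names of the alert conditions that fire for this window."""
    firing = []

    # Throughput: unusually few requests may indicate the feature is down.
    if stats.total_requests < MIN_REQUESTS_PER_WINDOW:
        firing.append("low_throughput")

    # Error rate: only meaningful when there was at least one request.
    if stats.total_requests > 0:
        error_rate = stats.failed_requests / stats.total_requests
        if error_rate > MAX_ERROR_RATE:
            firing.append("high_error_rate")

    # Latency: compare the window's p95 against the allowed maximum.
    if stats.p95_latency_seconds > MAX_P95_LATENCY_SECONDS:
        firing.append("high_latency")

    return firing
```

In practice these conditions would most likely live as alert rules in the existing monitoring stack rather than in application code; the sketch is only meant to help discuss which signals and thresholds matter.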
Acceptance Criteria
- Alerts trigger appropriately for feature degradation scenarios
- Documentation is accessible and actionable for on-call engineers
- Setup follows GitLab's observability standards and best practices