Track Admission Control Success Metrics
Overview
As part of the Admission Control implementation (Phase 1), we need to instrument and track key success metrics to validate that the feature works correctly and provides the expected resource governance capabilities. This tracking should be included in the first iteration of the product.
Objectives
Implement monitoring, logging, and metrics collection for the following success indicators:
- Admission Decision Accuracy
- Quota Breach Incidents
- Resource Exhaustion Prevention
- Request Volume to Admission Controllers
- Response Time from Admission Controllers
1. Admission Decision Accuracy (Target: 100%)
What to measure:
- Percentage of admission decisions that correctly enforce configured policies
- Count of runners admitted beyond their configured quotas (expected: zero)
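One way to compute these two figures is to replay each logged decision against the policy that was in force at the time. The sketch below assumes the only policy is a concurrency quota. The `Decision` record, its fields, and the helper names are hypothetical and not part of the Admission Control design.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Decision:
    """Hypothetical log record for one admission decision."""
    runner_id: str
    active_before: int  # runners already admitted when the request arrived
    quota: int          # configured limit at decision time
    admitted: bool      # what the admission controller actually decided


def expected_admit(d: Decision) -> bool:
    # Assumed policy: admit only while there is still room under the quota.
    return d.active_before < d.quota


def decision_accuracy(decisions: List[Decision]) -> Optional[float]:
    """Percentage of decisions that match the assumed policy, or None if no data."""
    if not decisions:
        return None
    correct = sum(1 for d in decisions if d.admitted == expected_admit(d))
    return 100.0 * correct / len(decisions)


def over_quota_admissions(decisions: List[Decision]) -> List[str]:
    """Runner IDs that were admitted although the quota was already reached."""
    return [d.runner_id for d in decisions if d.admitted and not expected_admit(d)]


log = [
    Decision("a", active_before=0, quota=2, admitted=True),
    Decision("b", active_before=1, quota=2, admitted=True),
    Decision("c", active_before=2, quota=2, admitted=True),   # should have been rejected
    Decision("d", active_before=2, quota=2, admitted=False),
]
print(decision_accuracy(log), over_quota_admissions(log))
```

With the 100% target, any non-empty result from `over_quota_admissions` would be worth alerting on directly rather than waiting for the percentage to drop.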
2. Number of Quota Breach Incidents (Target: 0)
What to measure:
- Count of times actual concurrent connections exceeded configured quota limits
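A breach count depends on what counts as one incident. The sketch below assumes concurrent connections are sampled periodically and that a continuous run of over-quota samples counts as a single incident. Both the sampling model and the function name are assumptions made for illustration.

```python
from typing import Iterable


def count_breach_incidents(samples: Iterable[int], quota: int) -> int:
    """Count transitions from within-quota to over-quota in a series of samples."""
    incidents = 0
    in_breach = False
    for value in samples:
        over = value > quota
        if over and not in_breach:
            # A new incident starts only when we cross from within quota to over it.
            incidents += 1
        in_breach = over
    return incidents


# Two separate excursions above a quota of 5: samples 6,7 and then 6.
concurrent = [3, 5, 6, 7, 4, 6, 5]
print(count_breach_incidents(concurrent, quota=5))
```

Counting transitions rather than over-quota samples keeps one long breach from inflating the incident count.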
3. Number of Resource Exhaustion Incidents (Target: 0)
What to measure:
- Count of times the system experienced an overload that admission control should have prevented
4. Number of Requests to Admission Controllers
What to measure:
- Total request volume to admission controller management API endpoints
- Success vs. error response codes
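Request volume and the success/error split can be tracked with a single counter keyed by operation and outcome. The sketch below is a minimal in-memory version. The `RequestCounter` class, the operation labels, and the rule that status codes below 400 count as success are all assumptions, not an existing API.

```python
from collections import Counter


class RequestCounter:
    """Hypothetical in-memory counter for management API requests."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, operation: str, status_code: int) -> None:
        # Assumption: 2xx and 3xx are successes, 4xx and 5xx are errors.
        outcome = "success" if 200 <= status_code < 400 else "error"
        self._counts[(operation, outcome)] += 1

    def total(self) -> int:
        return sum(self._counts.values())

    def errors(self) -> int:
        return sum(n for (_, outcome), n in self._counts.items() if outcome == "error")

    def error_rate(self) -> float:
        total = self.total()
        return self.errors() / total if total else 0.0


counter = RequestCounter()
for op, status in [("list", 200), ("create", 201), ("create", 422), ("delete", 500)]:
    counter.record(op, status)
print(counter.total(), counter.errors(), counter.error_rate())
```

In a real deployment the same labels would more likely feed an existing metrics system than a process-local counter.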
5. Response Time from Admission Controllers
What to measure:
- Latency for admission decision evaluation
- Latency for management API operations (list, get, create, delete)
- Identify slow operations that could block runner registration
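Slow operations are easier to spot with a high percentile per operation than with an average. The sketch below uses the nearest-rank percentile and flags operations whose p95 exceeds a threshold. The function names, the 100 ms threshold, and the sample data are illustrative assumptions.

```python
import math
from typing import Dict, List


def percentile(values: List[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty list (pct between 1 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def slow_operations(
    latencies_ms: Dict[str, List[float]], threshold_ms: float, pct: int = 95
) -> Dict[str, float]:
    """Return operations whose chosen percentile latency exceeds the threshold."""
    result = {}
    for operation, values in latencies_ms.items():
        if not values:
            continue  # no samples recorded for this operation
        value = percentile(values, pct)
        if value > threshold_ms:
            result[operation] = value
    return result


observed = {
    "list": [10, 12, 11, 13, 14],
    "create": [20, 25, 400, 22, 21],  # one slow outlier dominates the p95
}
print(slow_operations(observed, threshold_ms=100))
```

With only five samples per operation the p95 is simply the maximum, so a production version would want many more samples per window before alerting.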
Acceptance Criteria
- All five metric categories are instrumented and collecting data