2025-10-14: Error rate SLO violation for ci_runner_jobs on saas-linux-large-amd64 shard
Error rate SLO violation for ci_runner_jobs on saas-linux-large-amd64 shard (Severity 4 (Low))
Problem: The error rate for CI runner jobs on the saas-linux-large-amd64 shard exceeded acceptable thresholds, triggering an SLO violation.
Impact: More than 98% of CI runner job failures on the saas-linux-large-amd64 shard came from a single project. The shard's error rate peaked at 0.7157%, which violated the service level objective over the preceding 6 hours.
Causes: Unknown
Response strategy: We identified that the failures were concentrated in one project. The alert has since resolved, and reported errors have decreased significantly.
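The two figures quoted above (the shard error rate and the share of failures attributable to one project) can be reproduced with simple arithmetic. The following is a minimal sketch, not the tooling used during the incident: the function names, the project labels, and the sample counts are all hypothetical, and the real numbers would come from the runner metrics for the shard.

```python
from collections import Counter


def error_rate_percent(failed: int, total: int) -> float:
    """Return failed jobs as a percentage of all jobs (0.0 when there are no jobs)."""
    if total == 0:
        return 0.0
    return 100.0 * failed / total


def failure_share_by_project(failed_job_projects):
    """Return (project, percent of all failures) pairs, largest share first.

    `failed_job_projects` holds one project identifier per failed job.
    """
    counts = Counter(failed_job_projects)
    total = sum(counts.values())
    return [(project, 100.0 * n / total) for project, n in counts.most_common()]


# Hypothetical sample data, for illustration only (not the incident's data).
sample_failures = ["project-a"] * 99 + ["project-b"]
shares = failure_share_by_project(sample_failures)
top_project, top_share = shares[0]

print(f"{top_project} accounts for {top_share:.1f}% of failures")
print(f"error rate: {error_rate_percent(7, 1000):.4f}%")
```

Ranking projects by their share of failures is one way to tell a shard-wide problem from a single noisy project, which is the distinction the response above relied on.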
This ticket was created by incident.io to track INC-4790.