Corrective action: Container registry Apdex update

Summary

During production#14260 (closed) we saw an increase in error rates, specifically errors with the message UNKNOWN BLOB. These errors were not counted against the Container Registry Apdex, which was confusing.

After the incident, the Apdex still showed a healthy 100% when, in reality, the service was mostly unusable for the duration of the incident, affecting many workflows and pipelines.

Related Incident(s)

Originating issue(s): production#14260 (closed)

Desired Outcome/Acceptance Criteria

The Apdex accounts for error rates and pages the EOC when it drops.
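To illustrate the gap, here is a minimal Python sketch of an Apdex-style score that can either ignore failed requests or count them as frustrated. It is not the actual Container Registry SLI definition: the `Request` type, the `apdex` function, the `count_errors` flag, and the 1 s / 4 s thresholds are all hypothetical, chosen only to show how a latency-only score can stay at 100% while most requests fail quickly.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_s: float  # how long the request took, in seconds
    failed: bool       # True if the request returned an error


def apdex(requests, satisfied_s=1.0, tolerated_s=4.0, count_errors=True):
    """Apdex-style score: (satisfied + tolerating / 2) / total.

    Thresholds are illustrative assumptions, not production values.
    With count_errors=True, a failed request is treated as frustrated
    regardless of how fast it returned.
    """
    if not requests:
        return 1.0  # no traffic: treat as healthy
    satisfied = 0
    tolerating = 0
    for r in requests:
        if count_errors and r.failed:
            continue  # frustrated: contributes nothing to the numerator
        if r.duration_s <= satisfied_s:
            satisfied += 1
        elif r.duration_s <= tolerated_s:
            tolerating += 1
    return (satisfied + tolerating / 2) / len(requests)


# Hypothetical incident traffic: 8 fast failures, 2 fast successes.
sample = [Request(0.1, True)] * 8 + [Request(0.1, False)] * 2

latency_only = apdex(sample, count_errors=False)
with_errors = apdex(sample, count_errors=True)
```

In this sample the latency-only score stays at the maximum because every request is fast, while the error-aware score drops sharply. An alert threshold on the error-aware score is what would page on a drop like the one seen in the incident.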

Associated Services

Corrective Action Issue Checklist

  • Link the incident(s) this corrective action arose out of
  • Give context for what problem this corrective action is trying to prevent from re-occurring
  • Assign a severity label (this is the highest sev of related incidents, defaults to 'severity::4')
  • Assign a priority (this will default to 'Reliability::P4')