2025-10-20: dotcom impact from AWS outage
dotcom impact from AWS outage (Severity 3 (Medium))
Problem: An upstream service disruption caused a ripple effect, including a spike in 502 errors in the web service when the dependency proxy could not retrieve container images from Docker Hub.
Impact: Users saw increased 502 errors when the web service tried to pull container images not present in our cache. Errors began as early as 06:45 UTC and peaked between 08:00 and 09:35 UTC.
Causes: Docker Hub experienced an outage due to an AWS incident, which prevented our dependency proxy from fetching container images that were not already cached.
Response strategy: We monitored the symptoms and impact. The services recovered once the upstream provider resolved the root cause.
This ticket was created by incident.io to track INC-5020.