Probe Google Cloud API for expected responses from key cloud service dependencies
Summary
During the most recent Google Cloud Platform (GCP) outage, it took longer than desired to confirm that GCP issues were the root cause. If we had been regularly probing the GCP APIs, we could have coalesced the application alerts with cloud service alerts and identified GCP errors as the underlying root cause.
Definition of Done
- Identify the key GCP services that we should be testing for errors (e.g. Google Cloud Storage (GCS))
- For each of the identified services:
  - Probe Google Cloud APIs for the expected response codes (2XX, 4XX, and 5XX):
    - Make a request with incorrect credentials
    - Make a request to an invalid endpoint
    - Make a valid request that's expected to succeed
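To make the three probe types above concrete, here is a rough sketch of what a probe runner could look like, using only the Python standard library. The names `Probe`, `run_probe`, `http_status`, `status_class`, and `gcs_probes` are hypothetical. The GCS JSON API paths, the placeholder bucket, and the expected status class for each probe are assumptions that would need to be confirmed per service during implementation.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, List


def status_class(code: int) -> str:
    """Map an HTTP status code to its class, e.g. 404 -> '4XX'."""
    return f"{code // 100}XX"


def http_status(url: str, headers: Dict[str, str], timeout: float = 5.0) -> int:
    """Issue a GET and return the status code, including 4XX/5XX responses."""
    request = urllib.request.Request(url, headers=headers, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4XX/5XX, but the code is exactly what the probe wants.
        return err.code


@dataclass
class Probe:
    name: str
    url: str
    headers: Dict[str, str]
    expected: str  # "2XX", "4XX", or "5XX"


def run_probe(probe: Probe,
              fetch: Callable[[str, Dict[str, str]], int] = http_status) -> dict:
    """Run one probe and report whether the observed class matches the expected one."""
    try:
        code = fetch(probe.url, probe.headers)
        observed = status_class(code)
    except OSError as err:  # covers URLError, DNS failures, and timeouts
        code, observed = None, f"unreachable: {err}"
    return {"probe": probe.name, "code": code, "observed": observed,
            "expected": probe.expected, "ok": observed == probe.expected}


def gcs_probes(bucket: str, token: str) -> List[Probe]:
    """Hypothetical probe set for GCS; paths and expectations are assumptions."""
    base = "https://storage.googleapis.com"
    good = {"Authorization": f"Bearer {token}"}
    bad = {"Authorization": "Bearer invalid-token"}
    return [
        Probe("gcs-bad-credentials", f"{base}/storage/v1/b/{bucket}/o", bad, "4XX"),
        Probe("gcs-invalid-endpoint", f"{base}/storage/v1/no-such-endpoint", good, "4XX"),
        Probe("gcs-valid-request", f"{base}/storage/v1/b/{bucket}/o?maxResults=1", good, "2XX"),
    ]
```

A scheduler could call `run_probe` on each entry from `gcs_probes(...)` and emit the result as a metric or alert. In this sketch, a 5XX (or an unreachable endpoint) where a 2XX or 4XX was expected is the signal that would point at a provider-side problem. The `fetch` parameter is injectable so the matching logic can be tested without network access.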