Registry on GKE infrastructure readiness review
Please review the following document and leave your questions inline.
If you have no remarks about the content of the text, consider the following questions:
- Is there enough documentation in our runbooks such that we can recover from a failure of the Container Registry Service?
- Are there appropriate Pages in place such that we get alerted when the Container Registry begins to fail?
- Is the documentation in our runbooks sufficient that, if the bus factor is low and the original authors are unavailable, another member of Infrastructure can successfully rebuild the cluster and deploy the Container Registry?
- Do we pass security guidelines with a properly hardened installation and use of GKE?
- Is there anything questionable with the method chosen to deploy the Container Registry?
- Are there any improvements someone would suggest to either our deployment mechanism, the use of GKE, or the configuration of the Container Registry?
- Do team members know how to set up and connect to the Kubernetes clusters and perform troubleshooting?
- Can team members find the logs and dashboards we've created to assist with monitoring both Kubernetes and the Container Registry?
Closes #2
As an experiment, we are trying an MR approach to this review. If you feel the process is clumsy or not working well, please let us know so we can try something different next time. There are two MRs for the review, one for security and one for everyone else: