Implement Prometheus Metrics for Testing Connectivity to External Dependencies
Goal
Our readiness and liveness probes currently check the /api/v1/minds/config endpoint, which does not evaluate connectivity to external dependencies. We can implement a dedicated health endpoint that tries to connect to these external services and returns a 200 status code only if the engine is able to communicate with all of them.
What needs to be done
Implement a health endpoint that confirms connectivity to the following services:
- Cassandra
- Elasticsearch
- Redis
- Sendgrid
- SQS
- Permaweb
- Web3 Server
We'll also want to update the readiness and liveness probes in our Helm chart.
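The aggregation logic for such an endpoint could look roughly like the sketch below. It is written in Python purely for illustration and is not the engine's actual implementation. The names `tcp_check` and `run_health_checks` are hypothetical, the 503 status for a failed check is an assumption (the issue only specifies that 200 is returned on success), and a bare TCP connection is a much weaker test than a real client call against Cassandra, Redis, SQS and the rest.

```python
import socket
from typing import Callable, Dict, Tuple

# A check is any zero-argument callable that raises an exception on failure.
Check = Callable[[], None]


def tcp_check(host: str, port: int, timeout: float = 2.0) -> Check:
    """Build a check that only verifies a TCP connection can be opened."""
    def check() -> None:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    return check


def run_health_checks(checks: Dict[str, Check]) -> Tuple[int, Dict[str, str]]:
    """Run every check and return 200 only if all of them succeed."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:  # any failure marks the dependency as down
            results[name] = f"failed: {exc.__class__.__name__}"
    # Assumption: 503 signals "not healthy"; the issue does not name a code.
    status = 200 if all(v == "ok" for v in results.values()) else 503
    return status, results
```

A caller would register one check per dependency, for example `{"redis": tcp_check("redis.example.internal", 6379)}` (the hostname is a placeholder), and return the resulting status code and per-service results from the health endpoint. Keeping the per-service results in the response body makes it easier to see which dependency caused a probe to fail.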
QA
We can test this in Sandbox by applying a NetworkPolicy that restricts access to the required dependencies, which should cause the endpoint to fail and therefore the pod to show as unready.
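To confirm the expected result during QA, the pod's readiness can be read from its status conditions. The helper below is a small sketch, assuming the pod object has been fetched as JSON (for example with `kubectl get pod <pod-name> -o json`); the function name `pod_is_ready` is hypothetical.

```python
import json
from typing import Any, Dict


def pod_is_ready(pod: Dict[str, Any]) -> bool:
    """Return True if the pod's "Ready" condition has status "True"."""
    conditions = pod.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


# Trimmed-down example of a pod whose readiness probe is failing.
sample = json.loads(
    '{"status": {"conditions": ['
    '{"type": "Initialized", "status": "True"},'
    '{"type": "Ready", "status": "False"}]}}'
)
print(pod_is_ready(sample))
```

With the NetworkPolicy applied, this check is expected to report the pod as not ready; after the policy is removed, it should report the pod as ready again.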
UX/Design
N/A
Personas
N/A
Experiments
N/A
Acceptance Criteria
- Health endpoint is added to engine
- Helm chart is updated with readiness and liveness probes
Definition of Ready Checklist
- Definition Of Done (DoD)
- Acceptance criteria
- Weighted
- QA
- UX/Design
- Personas
- Experiments
Edited by Zack Wynne