Add readinessProbe for Triton with rolling updates
🧩 Problem to solve
Currently there is no readinessProbe for the Triton server in CS Model Gateway. This MR adds one.
💡 Proposal
The recent effort to add liveness/readiness probes to Triton could not keep pods running at all times while still checking pod health. This MR fixes that by introducing rolling updates, which do not terminate an existing pod until its replacement is ready to take over its services. Readiness is determined by a readinessProbe that calls Triton's built-in health endpoint, /v2/health/ready.
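For illustration, the probe described above could look roughly like the fragment below. This is a minimal sketch, not the manifest from this MR: the container name, the named port, and the timing values are assumptions, and port 8000 is Triton's default HTTP port, which may differ in Model Gateway.

```yaml
# Sketch of a container spec fragment (goes under spec.template.spec in a Deployment).
containers:
  - name: triton                 # hypothetical container name
    ports:
      - name: http
        containerPort: 8000      # Triton's default HTTP port; adjust if overridden
    readinessProbe:
      httpGet:
        path: /v2/health/ready   # Triton's built-in readiness endpoint
        port: http
      initialDelaySeconds: 10    # assumed value; allow time for models to load
      periodSeconds: 5           # assumed value
      failureThreshold: 3        # Kubernetes default
```

Until the probe succeeds, the pod is not marked Ready and receives no Service traffic, which is what lets a rolling update wait before retiring the old pod.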
⚔️ Plan of attack
Use a combination of this effort, the information on rolling updates, and documentation to implement a readiness probe in Model Gateway.
Note for Reviewer
The values maxSurge and maxUnavailable are set to their defaults. They do not need to appear in the YAML file when left at the defaults, but they are included for transparency.
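As a point of reference, a Deployment strategy with both values written out at the Kubernetes defaults (25% each) would look something like this sketch. The placement under the Deployment's top-level spec is standard; the rest of the manifest is omitted here.

```yaml
# Sketch of the Deployment-level strategy block with defaults made explicit.
spec:
  strategy:
    type: RollingUpdate        # default strategy type for Deployments
    rollingUpdate:
      maxSurge: 25%            # Kubernetes default: extra pods allowed above the desired count
      maxUnavailable: 25%      # Kubernetes default: pods allowed to be unavailable during the update
```

Removing these two keys would leave behavior unchanged, since Kubernetes applies the same values implicitly.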
cc @mray2020 @tle_gitlab @AndrasHerczeg @srayner
Relates to #14 (closed)