Add readinessProbe for Triton with rolling updates

Dylan Bernardi requested to merge add-readiness-probe into main

🧩 Problem to solve

Currently there is no readinessProbe for the Triton server in CS Model Gateway. This MR adds one.

💡 Proposal

The recent effort to create liveness/readiness probes for Triton could not keep pods running at all times while still checking their health. This MR fixes that by introducing rolling updates, which do not terminate an old pod until the new pod is ready to take over its services. Readiness is determined by a readinessProbe that calls Triton's built-in health endpoint: /v2/health/ready.
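
As a rough illustration of the proposal, a readinessProbe against Triton's health endpoint might look like the fragment below. This is only a sketch, not the manifest from this MR: the container name and the timing values are assumptions, and port 8000 is Triton's default HTTP port, which may differ in Model Gateway.

```yaml
# Hypothetical fragment of a Deployment pod template (spec.template.spec).
# Container name and timing values are illustrative assumptions.
containers:
  - name: triton                # assumed container name
    ports:
      - name: http
        containerPort: 8000     # Triton's default HTTP port
    readinessProbe:
      httpGet:
        path: /v2/health/ready  # endpoint named in this MR
        port: http
      initialDelaySeconds: 10   # assumed; allow time for models to load
      periodSeconds: 5          # assumed polling interval
      failureThreshold: 3       # assumed; mark unready after 3 failed checks
```

Until the probe succeeds, Kubernetes keeps the pod out of Service endpoints, so traffic only reaches a new pod once Triton reports ready.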

Plan of attack

Use a combination of this effort, the information on rolling updates, and documentation to implement a readiness probe in Model Gateway.

Note for Reviewer

The values maxSurge and maxUnavailable are set to their defaults. They do not need to be present in the YAML file when left at the defaults, but they are included for transparency.
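
For reference, the Kubernetes defaults for a Deployment's rolling update strategy are 25% for both fields. Written out explicitly, they would look roughly like this sketch, which shows only the strategy portion and is not copied from this MR's diff:

```yaml
# Deployment strategy with the Kubernetes default values written out explicitly.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # default; extra pods allowed above the desired count (rounded up)
      maxUnavailable: 25%  # default; pods allowed to be unavailable (rounded down)
```

Because maxUnavailable is rounded down, a Deployment with fewer than four replicas ends up with an effective maxUnavailable of 0 under these defaults, so an old pod is only removed after a new one passes its readiness probe.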

cc @mray2020 @tle_gitlab @AndrasHerczeg @srayner

Relates to #14 (closed)

Edited by Andras Herczeg