Instrument Triton GRPC client

Stan Hu requested to merge sh-add-grpc-instrumentation-triton-client into main

This uses py-grpc-prometheus to instrument the gRPC client. Unfortunately, to do this we need to reach into the channel and the gRPC client stub.

Sample output:

grpc_client_started_total{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY"} 2.0
grpc_client_started_created{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY"} 1.686638606783906e+09
grpc_client_handled_total{grpc_code="OK",grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY"} 2.0
grpc_client_handled_created{grpc_code="OK",grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY"} 1.686638607766525e+09
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="0.005"} 0.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="0.01"} 0.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="0.025"} 0.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="0.05"} 0.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="0.075"} 0.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="0.1"} 0.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="0.25"} 0.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="0.5"} 0.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="0.75"} 0.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="1.0"} 1.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="2.5"} 2.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="5.0"} 2.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="7.5"} 2.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="10.0"} 2.0
grpc_client_handling_seconds_bucket{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY",le="+Inf"} 2.0
grpc_client_handling_seconds_count{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY"} 2.0
grpc_client_handling_seconds_sum{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY"} 2.089886583
grpc_client_handling_seconds_created{grpc_method="ModelInfer",grpc_service="inference.GRPCInferenceService",grpc_type="UNARY"} 1.686638607766459e+09

Closes #153
