    xDS interop: Improve retry logic and logging for the k8s retry operations (#30607) · 5abe9701
    Sergii Tkachenko authored
    - Changes the order of waiting for pods to start: wait for the pods first, then for the deployment to transition to active. This should provide more useful information in the logs, showing exactly why the pod didn't start, instead of the generic "Replicas not available", ref b/200293121. This is also needed for https://github.com/grpc/grpc/pull/30594
    - Add support for `check_result` callback in the retryer helpers
    - Completely replaces `retrying` with `tenacity`, ref b/200293121. `retrying` is no longer maintained.
    - Improves the readability of timeout errors: now they contain the timeout (or the attempt number) that was exceeded, and the reason the retry failed (an exception or the check function):
      Before:  
      > `tenacity.RetryError: RetryError[<Future at 0x7f8ce156bc18 state=finished returned dict>]`
      
      After:
      > `framework.helpers.retryers.RetryError: Retry error calling framework.infrastructure.k8s.KubernetesNamespace.get_pod: timeout 0:01:00 exceeded. Check result callback returned False.`
    - Improves the readability of the k8s wait operation errors: now the log includes a colorized and formatted status of the k8s object being watched, instead of a dump of the full k8s object. For example, here's how an error caused by using an incorrect TD bootstrap image looks:
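
To illustrate the `check_result` callback and the more descriptive timeout error from the list above, here is a minimal stdlib-only sketch. It is not the `tenacity`-based helper from this change: `retry_until`, its parameters, and this `RetryError` class are hypothetical names chosen for illustration, and the message only mimics the "After" example.

```python
import datetime
import time
from typing import Any, Callable, Optional


class RetryError(Exception):
    """Hypothetical error type carrying a human-readable failure reason."""


def retry_until(
    func: Callable[[], Any],
    *,
    timeout: datetime.timedelta,
    wait: datetime.timedelta = datetime.timedelta(seconds=1),
    check_result: Optional[Callable[[Any], bool]] = None,
) -> Any:
    """Call func until check_result accepts its result or the timeout passes.

    Assumption for this sketch: every exception is treated as retryable.
    """
    deadline = time.monotonic() + timeout.total_seconds()
    name = getattr(func, "__qualname__", repr(func))
    while True:
        try:
            result = func()
            if check_result is None or check_result(result):
                return result
            reason = "Check result callback returned False."
        except Exception as err:
            reason = f"Last exception: {type(err).__name__}: {err}"
        if time.monotonic() >= deadline:
            # Report what was called, which limit was hit, and why the
            # last attempt did not count as a success.
            raise RetryError(
                f"Retry error calling {name}: timeout {timeout} exceeded. "
                f"{reason}"
            )
        time.sleep(wait.total_seconds())


def get_pod() -> dict:
    # Stand-in for a k8s lookup; always reports a pod that is not running.
    return {"phase": "Pending"}


try:
    retry_until(
        get_pod,
        timeout=datetime.timedelta(seconds=0.2),
        wait=datetime.timedelta(seconds=0.05),
        check_result=lambda pod: pod["phase"] == "Running",
    )
except RetryError as err:
    print(err)
```

Keeping the failure reason in the exception message means the log line alone tells the reader whether the operation kept raising or kept returning an unacceptable result.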