Pod capacity drops during deployments

During deployments at times of higher load, our Apdex takes a hit. Both ServiceAPI and ServiceWeb appear to be negatively impacted by this.

Sometimes this is harsh enough to alert the EOC.

Use this issue to investigate why this is happening. Ideally, the deployment should be tuned so that we can cleanly roll out new Pods without any negative impact on this metric.

Questions

Is the application suffering when taking its first few requests?
Are Kubernetes deployments not well tuned?
Are the metrics capturing incorrect data?
Is there an imbalance of traffic?
...

Results of this Issue

The cause has been found. The Deployment objects for some of our workloads (container registry, webservice) contain the spec.replicas definition. Kubernetes sees this value being applied on every deploy and configuration change, and resets the replica count to the number in the manifest. This in turn removes a set of Pods from service whenever the deployment changes.
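
To make the mechanism concrete, here is a minimal sketch in Python. It is a toy model rather than anything taken from our charts or from Kubernetes itself. It assumes something other than the manifest (for example an autoscaler) has raised the live replica count above the value pinned in spec.replicas. The function names and the numbers are hypothetical.

```python
from typing import Optional


def replicas_after_apply(live_replicas: int, manifest_replicas: Optional[int]) -> int:
    """Replica count requested once a manifest is applied.

    If the manifest pins spec.replicas, applying it sets the count back to
    that value. If the field is absent (None here), the live count is kept.
    """
    if manifest_replicas is None:
        return live_replicas
    return manifest_replicas


def pods_removed(live_replicas: int, manifest_replicas: Optional[int]) -> int:
    """Number of Pods taken out of service by a single apply (never negative)."""
    return max(0, live_replicas - replicas_after_apply(live_replicas, manifest_replicas))


# Hypothetical numbers: scaled up to 30 Pods under load, manifest pins 10.
print(pods_removed(30, 10))    # manifest pins spec.replicas
print(pods_removed(30, None))  # manifest omits spec.replicas
```

In this model, a deploy during high load with a pinned count drops capacity by the difference between the live and pinned values, which would match the Apdex dip we saw. With the field omitted, the apply leaves the replica count alone.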

The fun investigative conversation can be found in the thread below: #1992 (comment 674659356)

After modifying our chart to remove spec.replicas, further deploys no longer show this issue. The result can be found in the thread below: #1992 (comment 678236039)
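
As an illustration of what the change amounts to, the sketch below removes spec.replicas from a Deployment manifest represented as a plain Python dict. This is not the actual chart change, which was made in the chart templates. The helper name and the example manifest are hypothetical.

```python
import copy


def strip_replicas(manifest: dict) -> dict:
    """Return a copy of the manifest without spec.replicas.

    Only objects of kind Deployment are modified. The input is not mutated.
    """
    cleaned = copy.deepcopy(manifest)
    if cleaned.get("kind") == "Deployment":
        cleaned.get("spec", {}).pop("replicas", None)
    return cleaned


# Hypothetical manifest, reduced to the fields relevant here.
example = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "webservice"},
    "spec": {"replicas": 10, "template": {}},
}

cleaned = strip_replicas(example)
print("replicas" in cleaned["spec"])  # False
print(example["spec"]["replicas"])    # 10, the original is untouched
```

With the field gone from the rendered manifest, an apply no longer carries a replica count, so the count set at runtime is left in place.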

Edited Sep 15, 2021 by John Skarbek