Metrics to track failed deployments
Summary
We would like to track the number of failed deployments on each environment.
Proposal
Add a new metric called delivery_deployments_completed_total, similar to the existing delivery_deployment_started_total metric.
delivery_deployments_completed_total{target_env=""} should be incremented each time a deployment to an environment completes.
- Add a class called TrackDeploymentCompleted, similar to the existing TrackDeploymentStarted.
- Add a new rake task called metrics:deployment_completed, similar to the existing metrics:deployment_started.
- Add jobs for each environment to call the new rake task, similar to the existing metrics:deployment_started:env.
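The steps above could be sketched roughly as follows. This is a minimal illustration, not the existing implementation: the InMemoryCounter class, the constructor arguments, and the execute method are assumptions made so the example runs without a metrics client, and the real TrackDeploymentStarted interface may look different.

```ruby
# Hypothetical in-memory stand-in for a Prometheus counter, so the sketch
# runs without any metrics client library.
class InMemoryCounter
  def initialize
    @values = Hash.new(0)
  end

  # Adds 1 to the series identified by the given label set.
  def increment(labels: {})
    @values[labels] += 1
  end

  # Returns the current value for a label set (0 if never incremented).
  def get(labels: {})
    @values[labels]
  end
end

# Assumed shape of the proposed class; the real TrackDeploymentStarted
# interface may differ.
class TrackDeploymentCompleted
  METRIC = 'delivery_deployments_completed_total'

  def initialize(environment, counter:)
    @environment = environment
    @counter = counter
  end

  # Increments the counter once for the completed deployment.
  def execute
    @counter.increment(labels: { target_env: @environment })
  end
end

# A metrics:deployment_completed rake task, called from a per-environment
# job, would do roughly this once per completed deployment.
counter = InMemoryCounter.new
TrackDeploymentCompleted.new('gprd', counter: counter).execute
TrackDeploymentCompleted.new('gprd', counter: counter).execute
TrackDeploymentCompleted.new('gstg', counter: counter).execute

puts counter.get(labels: { target_env: 'gprd' })
puts counter.get(labels: { target_env: 'gstg' })
```

Passing the counter in as an argument is only a convenience for the sketch; it keeps the class testable without a running metrics backend.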
We can then compute derived metrics from the new metric:
- delivery_deployments_started_total{target_env="gprd"} - delivery_deployments_completed_total{target_env="gprd"} gives us the number of failed deployments to gprd.
- delivery_deployments_started_total{target_env=~"gstg-cny|gstg-ref|gprd-cny|gstg|gprd"} - delivery_deployments_completed_total{target_env=~"gstg-cny|gstg-ref|gprd-cny|gstg|gprd"} gives us the number of failed deployments to all environments.
- delivery_packages_tagging_total{pkg_type="auto_deploy"} - delivery_deployments_started_total{target_env="gstg-cny"} gives us the number of deployments that failed before reaching gstg-cny.
- The last two computed metrics, combined, give us all failed deployments.
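As a worked example of the arithmetic behind these computed metrics, the sketch below uses made-up counter readings; none of the numbers come from real data. It also assumes the multi-environment difference is summed into a single total, which in PromQL would probably need a sum(...) wrapper, since a subtraction between label-matched series yields one result per environment. A deployment that has started but not yet finished would also show up in the difference until it completes.

```ruby
# Hypothetical counter readings; all numbers are made up for illustration.
ENVIRONMENTS = %w[gstg-cny gstg-ref gprd-cny gstg gprd].freeze

# delivery_packages_tagging_total{pkg_type="auto_deploy"}
tagged = 20

# delivery_deployments_started_total, keyed by target_env
started = {
  'gstg-cny' => 18, 'gstg-ref' => 17, 'gprd-cny' => 16, 'gstg' => 16, 'gprd' => 15
}

# delivery_deployments_completed_total, keyed by target_env
completed = {
  'gstg-cny' => 17, 'gstg-ref' => 17, 'gprd-cny' => 16, 'gstg' => 15, 'gprd' => 14
}

# started - completed, per environment
failed_per_env = ENVIRONMENTS.to_h do |env|
  [env, started.fetch(env) - completed.fetch(env)]
end

# Failed deployments across all environments
failed_in_envs = failed_per_env.values.sum

# Packages that were tagged but never started deploying to gstg-cny
failed_before_first_env = tagged - started.fetch('gstg-cny')

# The two computed metrics combined
all_failed = failed_in_envs + failed_before_first_env

puts "failed to gprd: #{failed_per_env['gprd']}"
puts "failed in all environments: #{failed_in_envs}"
puts "failed before gstg-cny: #{failed_before_first_env}"
puts "all failed deployments: #{all_failed}"
```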