Improve the rollback check command to handle paused or failed deployments

When a deployment is paused or has failed, whether for testing purposes as we did today (production#4206 (closed)) or for any other reason, our rollback check command compares the wrong packages and displays incorrect results. Using a recent demo as an example:

image

The previous deployment was fully deployed to the entire fleet but the post-deployment migration job was canceled to support the rollback demo. This meant the deployment was in a paused state (neither complete nor failed).

"Upcoming" was technically "current" as that was the package running on all of our servers. Our rollback command listed in the image SHOULD have read:

/chatops run deploy 13.12.202104270320-fa1dd0ee3d0.ac90a15ead2 --rollback

Proposed solution

The new output should not provide a rollback command that can be copied and pasted as-is: we documented edge cases that require the release manager to figure out the proper package. Our rollback tooling is still in its infancy, so let's first figure out all the edge cases and only then automate the decision.

The following actionable improvements are ready to be implemented:

  • Replace the package name in the rollback command with a placeholder
  • Rename upcoming to new
  • Provide package name for each version (new, current, and previous)
  • Provide a link to the coordinator pipeline for each version (new, current, and previous)
  • Review whether we need additional work to detect and report failed deployment #1737 (closed) (close this issue if not needed)

The final result should look like the following sketch:

New: `sha` (compare to current)
New package: `package name`
Coordinator pipeline <-- we can easily get this from the package name

Current: `sha` (compare to previous)
Current package: `package name`

Previous: `sha`
Previous package: `package name`

:book: view runbook

Rollback command: `/chatops run deploy --rollback --production <PACKAGE>` <-- without a package, just a placeholder
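
As an illustration only, the sketch above could be rendered along these lines. This is a minimal Python sketch, not the actual chatops implementation: the `Version` dataclass, the `render_rollback_check` function, and the `pipeline_url` field are hypothetical names introduced here, and it assumes the sha, package name, and coordinator pipeline link for each version are already known. The one property it is meant to show is that the rollback command always ends with the `<PACKAGE>` placeholder and never with a real package name.

```python
from dataclasses import dataclass
from typing import List, Optional

# The rollback command never embeds a real package; the release manager fills it in.
PACKAGE_PLACEHOLDER = "<PACKAGE>"


@dataclass
class Version:
    """Hypothetical container for one entry (new, current, or previous)."""
    label: str                        # "New", "Current", or "Previous"
    sha: str
    package: str
    pipeline_url: str                 # assumed to be derived elsewhere from the package name
    compare_to: Optional[str] = None  # e.g. "current" for the new version


def render_version(version: Version) -> str:
    """Render the sha, package name, and coordinator pipeline link for one version."""
    header = f"{version.label}: `{version.sha}`"
    if version.compare_to:
        header += f" (compare to {version.compare_to})"
    return "\n".join([
        header,
        f"{version.label} package: `{version.package}`",
        f"Coordinator pipeline: {version.pipeline_url}",
    ])


def render_rollback_check(versions: List[Version]) -> str:
    """Render the full report, ending with a placeholder-only rollback command."""
    blocks = [render_version(v) for v in versions]
    blocks.append(":book: view runbook")
    blocks.append(
        f"Rollback command: `/chatops run deploy --rollback --production {PACKAGE_PLACEHOLDER}`"
    )
    return "\n\n".join(blocks)
```

Because the command line is built from a constant rather than from any of the `Version` objects, a paused or failed deployment cannot cause the wrong package to be suggested; choosing the package stays a manual step.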
Edited by Mayra Cabrera