Improve the rollback check command to handle paused or failed deployments
When a deployment is paused or has failed, whether deliberately for testing, as we did today (production#4206 (closed)), or for any other reason, our rollback check command compares the wrong packages and displays an incorrect result. Using a recent demo as an example:
The previous deployment was fully deployed to the entire fleet but the post-deployment migration job was canceled to support the rollback demo. This meant the deployment was in a paused state (neither complete nor failed).
In this state, "Upcoming" was technically "current", since that was the package running on all of our servers. The rollback command listed in the image should have read:
/chatops run deploy 13.12.202104270320-fa1dd0ee3d0.ac90a15ead2 --rollback
Proposed solution
The new output should not provide a rollback command that can be copied and pasted: we documented edge cases that require the release manager to figure out the proper package. Our rollback tooling is still in its infancy; let's first figure out all the edge cases and only then automate the decision.
Here follows an actionable list of improvements ready to be implemented:
- Replace the package name in the rollback command with a placeholder
- Rename `upcoming` to `new`
- Provide the package name for each version (`new`, `current`, and `previous`)
- Provide a link to the coordinator pipeline for each version (`new`, `current`, and `previous`)
- Review whether we need additional work to detect and report failed deployment #1737 (closed) (close this issue if not needed)
The final result should look like the following sketch:
New: `sha` (compare to current)
New package: `package name`
Coordinator pipeline <-- we can easily get this from the package name
Current: `sha` (compare to previous)
Current package: `package name`
Previous: `sha`
Previous package: `package name`
:book: view runbook
Rollback command: `/chatops run deploy --rollback --production <PACKAGE>` <-- without a package, just a placeholder
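
To make the target format concrete, here is a minimal Python sketch of a renderer for the output above. It is not the actual rollback check implementation: the `Version` class, its fields, and `render_rollback_check` are hypothetical names chosen for illustration, and the pipeline link is assumed to be resolved elsewhere from the package name. The one behavior it is meant to show is that the rollback command always ends with the `<PACKAGE>` placeholder, never a real package name.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder the release manager must replace by hand (taken from the sketch above).
ROLLBACK_COMMAND = "/chatops run deploy --rollback --production <PACKAGE>"


@dataclass
class Version:
    """Hypothetical container for one deployed version."""
    sha: str
    package: str
    pipeline_url: Optional[str] = None  # assumed to be derived from the package name


def render_rollback_check(new: Version, current: Version, previous: Version) -> str:
    """Build the rollback check message without suggesting a concrete package."""
    rows = (
        ("New", new, "current"),
        ("Current", current, "previous"),
        ("Previous", previous, None),
    )
    lines = []
    for label, version, compare_to in rows:
        suffix = f" (compare to {compare_to})" if compare_to else ""
        lines.append(f"{label}: `{version.sha}`{suffix}")
        lines.append(f"{label} package: `{version.package}`")
        if version.pipeline_url:
            lines.append(f"Coordinator pipeline: {version.pipeline_url}")
    lines.append(":book: view runbook")
    # The command is emitted with a placeholder only, so it cannot be pasted as-is.
    lines.append(f"Rollback command: `{ROLLBACK_COMMAND}`")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only; the SHAs, packages, and URL are made up.
    print(render_rollback_check(
        Version("aaa111", "example-package-new", "https://example.com/pipelines/3"),
        Version("bbb222", "example-package-current"),
        Version("ccc333", "example-package-previous"),
    ))
```

Keeping the command as a constant with a placeholder means the release manager has to pick the package from the three listed versions, which matches the intent of not automating that decision yet.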
