One of the tasks on our release checklist is to manually confirm that the version shown on the help page of the deployed instance matches the one we expect to have installed. As we've seen, it is easy to get distracted and not complete that task correctly.
Proposal
Add to release-finish and pre-finish a job that queries the API of the deployment instance and compares the version of the package that got deployed with the version that the application is reporting.
If the versions match, take no further action. If they differ, trigger a Slack notification with an @ mention of release-managers to let them know that something needs to be investigated.
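A minimal sketch of what such a check could look like, assuming the instance exposes its version through the `/api/v4/version` endpoint and that a token with API access is available. The function names, the `opener` parameter, and the plain string comparison are assumptions made for illustration, not the job that was eventually implemented; the Slack notification is left to the pipeline.

```python
import json
import urllib.request


def fetch_reported_version(base_url, token, opener=urllib.request.urlopen):
    """Return the version string the instance reports about itself.

    Assumption: the instance answers GET /api/v4/version with a JSON
    object that contains a "version" key. `opener` is injectable so the
    function can be exercised without network access.
    """
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v4/version",
        headers={"PRIVATE-TOKEN": token},
    )
    with opener(request) as response:
        return json.load(response)["version"]


def versions_match(expected, reported):
    """Compare the deployed package version with the reported version.

    Assumption: both strings use the same format, so only surrounding
    whitespace is ignored.
    """
    return expected.strip() == reported.strip()
```

A job built on this would exit non-zero when `versions_match` returns `False`, so the pipeline can handle the alerting.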
I suggest adding these jobs to the <ENV>-qa stage. This is technically a quality check, so logistically it makes sense to put it there, in my opinion. If these jobs are in the QA stage and one fails, it will already trigger our failure notification job, as desired, which means no new alert needs to be created.
I agree that this is a quality check but I wonder if it will be obvious enough to Quality and us that failing checks are our responsibility to investigate?
Is there any reason why Quality shouldn't add this to the list of items that are checked when QA is run against an environment?
I think we should create and own these checks since they relate to release tooling rather than product regressions.
In order to do this in QA pipelines, we need the triggered Quality pipeline to know the version we're deploying, and that pipeline then needs to make the value available to the actual test runs in the environment.
For the former, it sounds like we already do this via the $RELEASE environment variable; however, looking at a recent release.gitlab.net run, that variable looks to be empty:
Since we are interfacing with the pipeline-trigger library, hopefully we can pass this in as an additional option to the trigger? Or, more cleanly, if supported, via an option that gets passed to the qa task:
All the pieces for this are now merged. I'm trying to devise a way to test it without doing an actual release. I think we can just trigger a quality/release pipeline, give it the wrong DEPLOY_VERSION variable, and see if it fails?
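The idea of feeding the pipeline a wrong DEPLOY_VERSION can be illustrated with a small sketch. It assumes the check reads the expected version from the DEPLOY_VERSION environment variable and turns the result into an exit code; the function name and the messages are hypothetical, and an empty or unset variable is treated as a failure here so that a missing value does not pass silently.

```python
import os


def check_deploy_version(reported, env=None):
    """Return (exit_code, message) for the version sanity check.

    Assumption: the expected version arrives in DEPLOY_VERSION.
    `env` defaults to the process environment and can be replaced
    with a plain dict for testing.
    """
    env = os.environ if env is None else env
    expected = env.get("DEPLOY_VERSION", "").strip()
    if not expected:
        return 1, "DEPLOY_VERSION is empty or unset"
    if expected != reported.strip():
        return 1, f"expected {expected!r}, instance reports {reported!r}"
    return 0, "version matches"
```

Calling `check_deploy_version("13.4.0", {"DEPLOY_VERSION": "0.0.0"})` returns exit code 1, which is the failure such a test pipeline would be expected to surface.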
Version sanity check
D, [2020-09-21T11:37:19.019107 #22] DEBUG -- : Starting test: Version sanity check is the specified version
  is the specified version