Staging-Canary updates: Add new job within staging-canary for testing migrations
Problem
A recent incident caused issues for some customers. It was related to a migration that caused an incompatibility between canary and production-level services.
Proposed Solution
Add a new job that runs within the staging-canary deployment just after migrations are applied, but before application updates, to attempt to capture this situation early. Due to timing differences within the staging and production environments, we did not capture the event with our current mixed-environment test strategy.
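As a rough illustration of the intended ordering, the sketch below models a deployment as an ordered list of steps with the new check placed between migrations and application updates. It is not the actual deployment tooling: `Step`, `deploy`, `build_canary_deploy`, and the step names are all hypothetical, and it assumes a failed check halts the deployment, which this issue does not state.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One deployment step (hypothetical model, not the real pipeline)."""
    name: str
    run: Callable[[], bool]  # returns True on success


def deploy(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that were executed, including
    the failing one.
    """
    executed = []
    for step in steps:
        executed.append(step.name)
        if not step.run():
            break  # assumption: a failing step blocks everything after it
    return executed


def build_canary_deploy(post_migration_check: Callable[[], bool]) -> List[Step]:
    """Place the new check after migrations and before application updates."""
    return [
        Step("apply_migrations", lambda: True),
        # New job: exercise the not-yet-updated application against the
        # freshly migrated schema.
        Step("post_migration_qa", post_migration_check),
        Step("update_application", lambda: True),
    ]
```

With a failing check, `deploy(build_canary_deploy(lambda: False))` stops before `update_application`, which is the point at which an incompatible migration would be caught in this model.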
Related to investigation work in #1815 (closed)
Tasks
- Identify deployment injection point
- Identify helpful specs to capture issue
- Tag related specs for setting up new test pipeline (merged)
- Add new job in pipeline-common (in test)
- Move job definition and include logic to staging-canary
- Set up QA project for new test pipeline
- Set up Ops mirror for new test pipeline
- [Update gitlab-qa gem for handling new pipeline]
- Manually test pipeline for stability and reporting (WIP)
- Inject pipeline execution into ops deployment within staging-canary environment (WIP)
- Update pipeline triage documentation
- Update infrastructure environment documentation (draft)
Edited by Zeff Morgan