Review how we're using Staging and the impact this has on deployments
In a couple of recent incidents, the way we use Staging has led to us missing problems that then surfaced on Canary or Prod.
For example, in production#3647 (comment 511979962) a feature flag was enabled on Staging for testing, which led to us missing a bug on the same widget. Failing tests on Canary then resulted in an incident to resolve it.
Use this issue to consider:
- What are all the ways that we're using Staging?
- How many incidents are a result of Staging being in a different state from our Canary/Prod deployment (feature flags as well as mixed deployments)?
- Options for reducing the risk
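
As a starting point for the feature-flag part of the questions above, the drift between environments could be made visible by diffing flag states. This is only a minimal sketch: it assumes the flag states have already been exported as name-to-boolean mappings, and `diff_flags` and the flag names are hypothetical, not part of any existing tooling.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation: flag name -> enabled state for one environment.
FlagStates = Dict[str, bool]


def diff_flags(
    staging: FlagStates, prod: FlagStates
) -> Dict[str, Tuple[Optional[bool], Optional[bool]]]:
    """Return the flags whose state differs, as name -> (staging, prod).

    None means the flag is not set in that environment.
    """
    drift = {}
    for name in sorted(set(staging) | set(prod)):
        staging_state = staging.get(name)
        prod_state = prod.get(name)
        if staging_state != prod_state:
            drift[name] = (staging_state, prod_state)
    return drift


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    staging = {"new_widget": True, "dark_mode": False}
    prod = {"new_widget": False, "dark_mode": False}
    for name, (s, p) in diff_flags(staging, prod).items():
        print(f"{name}: staging={s} prod={p}")
```

A report like this would not cover mixed deployments, but it could help answer how often Staging and Canary/Prod differ on flags at the time of a deploy.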