Limit the impact for far-reaching work
Context
Recently in Access, three production incidents occurred that could have been contained by a feature flag. These incidents were related to the work on Linear Namespaces, where we were initially very cautious, rolling out gradually, and were able to identify issues in advance. Outside of Linear Namespaces, all Access work is far-reaching, and changes have the potential to cause disruption for all members. This leaves little room for error and may call for special precautions.
Proposal
- Every ~"group::access" MR should be reviewed first by another ~"group::access" engineer.
- Feature flags with gradual rollouts will be required for Linear Namespaces work, either by grouping the work behind a single flag or by using multiple flags.
- All ~"group::access" MRs must have the question "Should this be behind a feature flag?" asked before being merged. Because items in the MR template sometimes get skipped, we'll be modifying DangerBot to ask it instead. The aim is first to remind engineers about feature flag usage, and second to challenge their reasoning when they decide a change should not be behind a flag.
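
The DangerBot check from the last item could follow logic along these lines. This is only a sketch of the decision, not the actual Danger rule: Danger rules are normally written in Ruby, and the `MergeRequest` type, the `feature_flag_reminder` function, and the "Feature flag decision:" description line are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Label taken from the proposal; everything else below is an assumption.
ACCESS_LABEL = "group::access"
FLAG_QUESTION = "Should this be behind a feature flag?"
# Hypothetical convention: the author records their answer on a line
# starting with this prefix in the MR description.
DECISION_PREFIX = "Feature flag decision:"


@dataclass
class MergeRequest:
    """Hypothetical stand-in for the MR data a bot would see."""
    labels: Set[str]
    description: str = ""


def feature_flag_reminder(mr: MergeRequest) -> Optional[str]:
    """Return a reminder message, or None if no reminder is needed."""
    # Only MRs owned by the Access group are in scope.
    if ACCESS_LABEL not in mr.labels:
        return None

    # Skip the reminder once the author has written down a non-empty answer.
    for line in mr.description.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith(DECISION_PREFIX.lower()):
            if stripped[len(DECISION_PREFIX):].strip():
                return None

    return (
        f"{FLAG_QUESTION} If not, please explain why in the description "
        f"on a line starting with '{DECISION_PREFIX}'."
    )
```

Requiring a written answer, rather than a checkbox, is one way to get at the second goal above: the reviewer sees the reasoning and can challenge it.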
Next steps
- Identify other groups/work that has a large blast radius
- Consider if the above proposal makes sense for those groups
Edited by Michelle Gill