FY26 Q1 HAProxy/Traffic Routing DR Gameday

Disaster Recovery Gameday

Overview

Disaster recovery gamedays should be conducted the same way as a normal change request issue. Some gameday actions can interrupt services, so take appropriate caution.

Two people should participate in the gameday process. One person is responsible for following and executing the process steps. The other is responsible for taking notes, logging problems, and approving any merge requests.

Most gameday processes take between two and four hours to execute, including preparation and cleanup time.

Prepare

  1. Create a new issue in the Production project and select the appropriate template for the gameday you are performing. Gameday Templates Location
  2. Read over the issue and familiarize yourself with the steps and access required.
  3. Ask any questions about the process in this issue. Feel free to reach out to the Production Engineering::Ops team via Slack (#g_production-engineering_ops) or by pinging this issue's author.
  4. Complete any steps for preparing for the change, including seeking reviews and approvals and announcing the pending gameday.

Execute

  1. Follow the steps of the gameday change issue.
  2. Record in this issue any notes or problems that result from the gameday.

Report

  1. Use this issue to report timing information.
  2. Re-assign this issue to the author.

Context and Reference

Edited by Pierre Guinoiseau