Perform a database failover in staging so the DB team is more familiar with this process
Since the introduction of repmgr, the DB team hasn't had a chance to get familiar with the new failover approach, so the Production team is currently the only group that knows how it works. Part of our OKR is to document this process and train people on it: https://gitlab.com/gitlab-com/infrastructure/issues/3449
To achieve this, I would like to plan a meeting of roughly one hour in which the entire DB team pairs up with a production engineer to perform a failover in staging. I see it going roughly like this:
- The production engineer runs the failover and explains things
- Each DB engineer then performs the same failover while sharing their screen
- Based on the outcome, we make sure the process is documented (if it isn't already) and that the documentation is linked from the DB handbook page
Who from the production team would be best suited to help us out with this?
Edited by Yorick Peterse