PostgreSQL: Using delayed replicas for disaster recovery
(Working title)
See https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/5255 .
Topics to cover (brainstorming):
- Delayed and archive replicas
- Use cases and why they are important
- Time to recovery using the delayed replica
- Include comparison of S3 vs GCS in terms of recovery speed
- Relate to PostgreSQL's implementation of delayed WAL replay (recovery_min_apply_delay)
- Practical guide to recovery using delayed replica
- Use pg_waldump to find the transaction ID, or fall back to timestamps
- Recommendation to log DDL statements along with the transaction ID
- Use real-world example (web-ide label deletion)
- Why a delayed replica only really makes sense with a WAL archive (not streaming replication: what if the primary is corrupt?)
- Show why archive testing is important to check for WAL corruption (to make sure PITR stays possible)
- Replication is not a backup tool - or is it?
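Possible illustration for the "time to recovery" topic (a rough sketch, not taken from our setup): PostgreSQL applies recovery_min_apply_delay relative to the commit timestamp written on the primary, so the time left to pause replay after an incident is roughly the commit time plus the delay, minus now. The 8 hour delay, the incident timestamp, and the function names below are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical value: what recovery_min_apply_delay is set to on the delayed replica.
APPLY_DELAY = timedelta(hours=8)


def replay_deadline(committed_at, apply_delay=APPLY_DELAY):
    """Earliest time at which the delayed replica may replay a commit made at committed_at."""
    return committed_at + apply_delay


def remaining_window(committed_at, now, apply_delay=APPLY_DELAY):
    """Time left to pause replay before the bad commit is applied; negative means too late."""
    return replay_deadline(committed_at, apply_delay) - now


# Made-up incident: a destructive statement commits at 09:00 UTC and is noticed 2.5 hours later.
incident = datetime(2018, 10, 1, 9, 0, tzinfo=timezone.utc)
noticed = incident + timedelta(hours=2, minutes=30)
print(remaining_window(incident, noticed))  # 5:30:00
```

If the window is still positive, replay can be paused on the replica with pg_wal_replay_pause() (pg_xlog_replay_pause() before PostgreSQL 10). If it is negative, the replica has already applied the change and recovery has to go through the WAL archive instead.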
Edited by Andreas Brandl