[META] Database Backup/Redundancy Improvements

Goals:

  • Faster/More Frequent Staging Refreshes (#2718 (moved))
  • Improve SLA numbers
  • Automate automate automate
  • Security security security

Track various proposed backup improvements:

  • Documented SLA
  • Automated Restore Tests (#2709 (closed) and #1265 (moved))
  • Time-Delayed Standby #2722 (moved)
  • Logical replica
  • Replace WAL-E (#2721 (closed))
  • Change replication auth from password to something else, probably SSL certificates
  • More reliable replication lag monitoring
  • Use local Azure blob storage instead of AWS S3 when running on Azure
  • Store WAL for backups on a disk instead of as tar files in blobs, so it can be snapshotted and mounted directly?
  • Use LVM or Azure snapshots for backups instead of tar?
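
For the replication lag monitoring item, one possible starting point is to measure lag in bytes by comparing the primary's current WAL position with the position a replica has replayed. The sketch below only does the arithmetic on PostgreSQL LSN strings (the `hi/lo` hex form); the function names and the alert threshold are hypothetical, and how the two LSNs are collected is left open. It is an illustration, not the monitoring we run today.

```python
# Minimal sketch: byte lag between two PostgreSQL LSNs.
# Function names and the default threshold are assumptions for illustration.

def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '16/B374D848' to an absolute byte position."""
    hi, lo = lsn.split("/")
    # The part before the slash is the high 32 bits, the part after is the low 32 bits.
    return (int(hi, 16) << 32) | int(lo, 16)


def replication_lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Bytes of WAL the replica still has to replay (never negative)."""
    return max(0, parse_lsn(primary_lsn) - parse_lsn(replica_lsn))


def lag_exceeds(primary_lsn: str, replica_lsn: str,
                threshold_bytes: int = 64 * 1024 * 1024) -> bool:
    """True when the lag is above the (hypothetical) alert threshold."""
    return replication_lag_bytes(primary_lsn, replica_lsn) > threshold_bytes
```

Byte lag complements time-based lag: on an idle primary the time since the last replayed transaction keeps growing even though the replica is fully caught up, while the byte difference stays at zero.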

cf. BAD: #2671 (closed) and !346 and

Edited by Gregory Stark