Testing the restore-from-backup procedure
In our post-mortem blog post, we wrote:
> 8. Why was the backup procedure not tested on a regular basis? - Because there was no ownership, as a result nobody was responsible for testing this procedure.
Now, I see a number of issues about setting up (streaming) backups (#1152 (closed), #1239 (closed)) and about having monitoring in place (#1199 (moved)) to make sure that those backups are actually happening. But I don't yet see an issue for explicitly testing the restore-from-backup procedure on a regular basis.

What do we have documented about this? I found nothing in the runbooks, although there is an open issue that calls for more documentation in general (#1138 (closed)). So, to be concrete about it:
- Do we have a "restore from backups" procedure?
  - If yes: where?
  - If no: do we have enough systems set up to be able to write this up?
    - If yes: @pcarranza can you nominate someone to write it up?
    - If no: @pcarranza can you describe the steps that need to be completed to get to a full path to restoration?
- What am I forgetting to ask?