Add documentation for how to detect and alert on data loss
Issue
In multiple discussions, internal support teams and external customers have told us that our documentation on detecting data loss that could require intervention in a Gitaly Cluster is unclear or incomplete.
Proposal
The documentation currently describes how repositories identified as having data loss are marked as read-only.
It also mentions the counter available in the Grafana dashboards.
What appears to be missing is clear documentation of what happens in a possible data loss scenario:
- How possible data loss is determined
- What automated steps are taken by Gitaly Cluster when potential data loss is detected
- How an administrator is alerted to this or can see this has occurred
- What manual steps may need to be taken by an administrator