Sharding Working Group: Review database incidents and determine if/how sharding would have helped.
This is a follow-up item from the Database Sharding Working Group agenda on Feb 19 (link).
Please add any database-related incidents that would have benefited from sharding or, equally valuable, would have been much worse had sharding been in place.
- Start a new comment thread with the issue
- Add a description of how the incident would have been better/worse with sharding
The description will be updated weekly with a quick summary. This feedback will help us better define our sharding strategy going forward.
The issue list referenced in the agenda: https://docs.google.com/document/d/1_sI-P2cLYPHlzDiJezI0YZHjWAC4BSKJ8aL0cNduDlo/edit#bookmark=id.9vpt8x31r9b8
GitLab list - https://gitlab.com/gitlab-com/gl-infra/production/issues?scope=all&utf8=%E2%9C%93&state=all&label_name[]=incident
Breakdown by Priority and Severity
The full list contains over 500 incidents. It was suggested that we prioritize by Priority label. Here's a quick breakdown; incidents without a Priority label are grouped by Severity label instead.
| Priority / Severity label | Count |
|---|---|
| ~P1 | 23 |
| ~P2 | 7 |
| ~P3 | 30 |
| ~P4 | 0 |
| ~S1 no Priority | 42 |
| ~S2 no Priority | 92 |
| ~S3 no Priority | 128 |
| ~S4 no Priority | 165 |
| No priority or severity | 36 |
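
For anyone who wants to refresh these numbers, the bucketing behind the table can be sketched as follows. This is a minimal sketch, assuming each incident is assigned to its Priority label when it has one, to its Severity label otherwise, and to a catch-all row when it has neither. The helper names (`bucket`, `breakdown`) and the sample label lists are hypothetical; only `table_counts` is taken from the table above.

```python
from collections import Counter

PRIORITIES = ("P1", "P2", "P3", "P4")
SEVERITIES = ("S1", "S2", "S3", "S4")


def bucket(labels):
    """Return the breakdown row for one incident, given its label names.

    Assumption: a Priority label wins over a Severity label, which is how
    the "S1 no Priority" style rows in the table are read here.
    """
    labels = set(labels)
    for priority in PRIORITIES:
        if priority in labels:
            return priority
    for severity in SEVERITIES:
        if severity in labels:
            return f"{severity} no Priority"
    return "No priority or severity"


def breakdown(incidents):
    """Count incidents per row; `incidents` is an iterable of label lists."""
    return Counter(bucket(labels) for labels in incidents)


# Hypothetical label lists, for illustration only (not real incidents).
sample = [
    ["incident", "P1", "S2"],
    ["incident", "S1"],
    ["incident", "S1"],
    ["incident"],
]
sample_counts = breakdown(sample)

# Counts copied from the table above; their sum gives the overall total.
table_counts = {
    "P1": 23, "P2": 7, "P3": 30, "P4": 0,
    "S1 no Priority": 42, "S2 no Priority": 92,
    "S3 no Priority": 128, "S4 no Priority": 165,
    "No priority or severity": 36,
}
total = sum(table_counts.values())
```

Summing the rows gives 523 incidents, consistent with the "over 500" figure.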