Provide descriptive messaging when Gitaly is unreachable
Problem to solve
If Gitaly becomes unreachable, several areas of the application are affected, including:
- Dashboard
- Exploring projects
- Accessing a repository
- File browsing
It was previously discussed that we would show a warning alert identifying the affected projects, along with information on why we can't load everything that was requested and what can be done to resolve the issue.
The problem is that, to identify the affected projects, we'd need to request the uncached status from Gitaly, which would increase the load on Gitaly at a time when it is unstable or unreachable. Additionally, these statuses would ideally be displayed across the application, since several different pages are impacted. We need a way to communicate the status broadly, rather than individually on each page or project.
Intended users
All personas would be affected by Gitaly being unreachable.
Proposal
As discussed in #205488 (comment 329787521):
Show a warning message that something went wrong (when GRPC::Unavailable and GRPC::DeadlineExceeded are raised) and that not all content is displayed. Do this for every request, including AJAX ones, but show the message only once, and only on the page where the problem was experienced.
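The proposal above could be sketched roughly as follows. This is a minimal illustration, not GitLab's actual implementation: `GitalyWarning`, `guard`, the `fallback:` parameter, and the message text are all hypothetical names, the `flash` hash only stands in for a Rails-style flash, and the `GRPC` classes are stubbed when the grpc gem is not loaded so the snippet runs on its own.

```ruby
# Stand-ins for the grpc gem's error classes, defined only when the gem is
# not loaded, so that this sketch is runnable without it.
unless defined?(GRPC)
  module GRPC
    class BadStatus < StandardError; end
    class Unavailable < BadStatus; end
    class DeadlineExceeded < BadStatus; end
  end
end

# Hypothetical helper: wraps code that may call Gitaly and records a single
# warning instead of failing the whole request.
class GitalyWarning
  # Assumed wording; the issue does not specify the final message.
  MESSAGE = 'Something went wrong and not all content could be displayed.'

  attr_reader :flash

  def initialize
    @flash = {} # stand-in for a per-request, Rails-style flash hash
  end

  # Runs the block; if Gitaly is unreachable or times out, stores the
  # warning once and returns the fallback so the page can still render.
  def guard(fallback: nil)
    yield
  rescue GRPC::Unavailable, GRPC::DeadlineExceeded
    @flash[:warning] ||= MESSAGE
    fallback
  end
end

# Usage: two failing calls in the same request still produce one warning.
warning  = GitalyWarning.new
branches = warning.guard(fallback: []) { raise GRPC::Unavailable }
commits  = warning.guard(fallback: []) { raise GRPC::DeadlineExceeded }
```

Because the warning is stored with `||=`, repeated failures within one request do not stack up messages, which matches the "only show once" requirement. For AJAX requests, the same message would have to be returned in the response for the page to display it; that part is not shown here.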
Previous Proposal
We want to communicate these things to users:
- **What** the problem is and what it may cause (Gitaly is unreachable, XYZ data will be unavailable)
- **How** it can be resolved
One suggestion is to display a custom 500 error page when the user tries to access a project that is unreachable, and to monitor Gitaly's server status in order to show a global warning alert.
Permissions and Security
Documentation
Availability & Testing
- As noted, availability could be affected if monitoring Gitaly's status adds load when Gitaly is already unstable.
- Unit/integration tests should provide most of the coverage, but a new E2E test should be added for extra confidence.
- The change isn't expected to affect existing E2E tests.