Unable to check replication using UI if replication is broken
Summary
A customer was unable to confirm that replication was broken. We have a tool in the UI to check this, but they were unable to log in to the UI (because replication was broken). It was also difficult to determine at the database layer whether replication was working, because there was very little local data to replicate.
- Replication broke and the customer was trying to figure out what was going on
- When they went to the secondary to log in, they were redirected to the primary (as expected)
- But they couldn't log in (because replication was broken)
- We didn't know that replication was completely broken, so we were trying to repair the login part to get more information about what was wrong
- We suggested logging in to the primary and destroying the secondary node to recreate the application
- The customer did that, and tried to log into the secondary. They were redirected to the primary and login failed again
- When the customer logged in to the primary as normal, the admin page for Geo nodes returned a 404 and offered no option to repair authentication
Steps to reproduce
In theory, this could be reproduced as follows:
- Set up a primary node with a secondary node
- Stop replication
- Remove and re-add the node on the primary node's admin screen
- Log out of the secondary node
- Try to log into the secondary (get the expected redirect to the primary)
- Try to log in to the primary; it will display an OAuth login error:
"Client authentication failed due to unknown client, no client authentication included, or unsupported authentication method."
What is the current bug behavior?
The user is unable to log in to the secondary node, and we provide no useful feedback about why.
What is the expected correct behavior?
We should provide useful feedback that database replication may not be working correctly.
One possible way to do that is to query the pg_replication_slots view: https://www.postgresql.org/docs/9.6/static/view-pg-replication-slots.html
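As a rough illustration of the replication-slot check suggested above, here is a minimal Python sketch, not a proposed implementation. It assumes the query is run against the primary's PostgreSQL instance and that the Geo setup uses a replication slot; the function name, the message wording, and the sample slot name are all hypothetical. The connection code is left out so the example stays self-contained.

```python
# Hypothetical sketch: turn pg_replication_slots rows into user-facing warnings.
# slot_name and active are real columns of the pg_replication_slots view;
# everything else here (names, messages) is an assumption for illustration.
SLOTS_QUERY = "SELECT slot_name, active FROM pg_replication_slots"


def replication_warnings(rows):
    """Return warning strings for rows of (slot_name, active) from SLOTS_QUERY."""
    rows = list(rows)
    if not rows:
        # Assumption: the deployment is expected to have at least one slot.
        return ["No replication slots found; database replication may not be configured."]
    warnings = []
    for slot_name, active in rows:
        if not active:
            # An inactive slot means no standby is currently streaming from it.
            warnings.append(
                f"Replication slot '{slot_name}' is inactive; "
                "database replication may not be working."
            )
    return warnings


if __name__ == "__main__":
    # Sample row standing in for a real query result.
    for message in replication_warnings([("geo_secondary", False)]):
        print(message)
```

A check like this could feed the login error page or the Geo nodes admin screen, so that an OAuth failure is accompanied by a hint about replication. An inactive or missing slot is only an indicator, since streaming replication can also run without slots.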
Relevant logs and/or screenshots
Support Ticket: http://support.gitlab.com/hc/requests/104514