WIP: Resolve "Geo: Inconsistent database replication status when Wal-E streaming is in use"

What does this MR do?

In lieu of a better way to determine whether replication is actually working, we'll reduce the check to verifying that replication is set up, meaning the secondary database is in read-only/recovery mode. That is already checked by the line

        return 'The Geo node has a database that is not configured for streaming replication with the primary node.' unless self.database_secondary?

Are there points in the code the reviewer needs to double check?

Why was this MR needed?

As mentioned in the issue, we're seeing the message "The Geo node does not appear to be replicating the database from the primary node." on the Geo secondary in production (gprd). This is triggered by the line

        return 'The Geo node does not appear to be replicating the database from the primary node.' if Gitlab::Database.pg_stat_wal_receiver_supported? && !self.streaming_active?

in https://gitlab.com/gitlab-org/gitlab-ee/blob/master/ee/lib/gitlab/geo/health_check.rb even though replication is in fact working.

This is what is seen in the secondary console:

Gitlab::Geo::HealthCheck.streaming_active?
=> false
Gitlab::Database.pg_stat_wal_receiver_supported?
=> true
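
To make the effect of the change concrete, here is a minimal sketch of the check before and after this MR, fed with the values seen in the console above. It is not the actual code in health_check.rb: the method names `status_before` and `status_after`, the keyword arguments, and the empty-string "healthy" return value are all assumptions made for illustration. It also assumes the secondary passes the `database_secondary?` check, which the observed message suggests.

```ruby
# Illustrative sketch only. Method names, keyword arguments, and the
# empty-string "healthy" result are hypothetical, not GitLab's API.
NOT_SECONDARY = 'The Geo node has a database that is not configured for ' \
                'streaming replication with the primary node.'
NOT_REPLICATING = 'The Geo node does not appear to be replicating the ' \
                  'database from the primary node.'

# Before: the recovery-mode check is followed by the WAL receiver check.
def status_before(database_secondary:, wal_receiver_supported:, streaming_active:)
  return NOT_SECONDARY unless database_secondary
  return NOT_REPLICATING if wal_receiver_supported && !streaming_active

  '' # assumed to mean "no problem found"
end

# After: only the read-only/recovery-mode check remains.
def status_after(database_secondary:)
  return NOT_SECONDARY unless database_secondary

  '' # assumed to mean "no problem found"
end

# Values from the secondary console; database_secondary is assumed true.
observed = { database_secondary: true, wal_receiver_supported: true, streaming_active: false }

puts status_before(**observed).inspect
puts status_after(database_secondary: observed[:database_secondary]).inspect
```

With these inputs, the old check reports the "not replicating" message while the reduced check reports no problem. The trade-off is that the reduced check no longer distinguishes a secondary that is set up but has stopped receiving changes.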

Screenshots (if relevant)

Does this MR meet the acceptance criteria?

What are the relevant issue numbers?

Closes #5933 (closed)

Edited by Brett Walker
