Geo Secondary shows unhealthy after enabling FDW in 10.6.4
Zendesk ticket: https://gitlab.zendesk.com/agent/tickets/94818
Summary
After upgrading from 10.5.4 to 10.6.4, the secondary Geo node shows as unhealthy, and /admin/geo_nodes warns that the Foreign Data Wrapper schema is out of sync.
Relevant logs
gitlab-ctl reconfigure shows:
* postgresql_fdw[gitlab_secondary] action create
* postgresql_query[enable postgres_fdw extension on gitlabhq_geo_production] action run (skipped due to not_if)
* postgresql_query[create fdw gitlab_secondary on gitlabhq_geo_production] action run (skipped due to not_if)
* postgresql_query[update fdw gitlab_secondary on gitlabhq_geo_production] action run (skipped due to not_if)
gitlab-rake gitlab:geo:check shows:
Checking Geo ...
GitLab Geo is available ... yes
GitLab Geo is enabled ... yes
GitLab Geo secondary database is correctly configured ... yes
Using database streaming replication? ... yes
GitLab Geo tracking database is configured to use Foreign Data Wrapper? ... yes
GitLab Geo tracking database Foreign Data Wrapper schema is up-to-date? ... no
Try fixing it:
Follow Geo setup instructions to configure secondary nodes with FDW support
If you upgraded recently check for any new step required to enable FDW
If you are using Omnibus GitLab try running:
gitlab-ctl reconfigure
Troubleshooting steps taken
Verified contents of gitlab.rb and ran:
gitlab-ctl reconfigure
gitlab-ctl restart postgresql
gitlab-ctl reconfigure
Ran database migrations on tracking database:
gitlab-rake geo:db:migrate
Refreshed foreign tables:
gitlab-rake geo:db:refresh_foreign_tables
The tracking database lists no foreign tables:
gitlabhq_geo_production=# \det
List of foreign tables
Schema | Table | Server
--------+-------+--------
(0 rows)
Dropped SERVER and SCHEMA:
DROP SERVER gitlab_secondary CASCADE;
DROP SCHEMA gitlab_secondary;
We also saw the following:
SELECT count(1) FROM gitlab_secondary.projects; gives the error:
ERROR: No user mapping found for gitlab-psql
and GitLab complains about foreign tables being out of date.
The reconfigure sets a user mapping with gitlab_geo but the server appears to be using gitlab-psql.
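To check which role the session runs as and which roles actually have a mapping on the foreign server, a few catalog queries might help. This is only a sketch: it assumes a psql session connected to gitlabhq_geo_production, and it takes the server name gitlab_secondary from the reconfigure output above. pg_foreign_server and pg_user_mappings are standard PostgreSQL catalogs, so nothing here is specific to GitLab.

```sql
-- Role the current session is connected as
-- (the error above suggests this is gitlab-psql).
SELECT current_user;

-- Foreign servers defined in this database, with owner and options.
-- Assumption: the server is named gitlab_secondary, as in the reconfigure log.
SELECT srvname, pg_get_userbyid(srvowner) AS owner, srvoptions
FROM pg_foreign_server;

-- Roles that have a user mapping on that server.
-- usename is 'public' when the mapping applies to PUBLIC.
SELECT srvname, usename, umoptions
FROM pg_user_mappings
WHERE srvname = 'gitlab_secondary';
```

If the role returned by the first query does not appear in the last result, that would be consistent with the user mapping error above.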
We disabled FDW by setting geo_secondary['db_fdw'] = false and running a reconfigure. The secondary node still shows as unhealthy, and there is now a warning that FDW is not being used.