Geo discussion: Should Geo node primary/secondaryness be controlled by database records?
A few weeks ago, when I was chatting with @WarheadsSE about gitlab-org/build/CNG!220 (merged), there was a moment of totally understandable confusion that started kind of like this:
J: This container will start in a Geo secondary with geo: enabled. How does it know it's in a Geo secondary?
M: It finds a GeoNode record in the DB with the same name as its Gitlab.config.geo.node_name, and if that record is marked "secondary", then it knows it's in a secondary.
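To make that lookup concrete, here is a minimal sketch of the logic as described in the exchange above. It is not GitLab's actual code (which lives in the Rails app); GeoNodeRecord and resolve_role are hypothetical names, and the list of records stands in for the GeoNode table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GeoNodeRecord:
    """Hypothetical stand-in for a row in the GeoNode table."""
    name: str
    primary: bool


def resolve_role(node_name: str, records: List[GeoNodeRecord]) -> Optional[str]:
    """Return "primary" or "secondary" for the record whose name matches
    node_name, or None when no record matches."""
    for record in records:
        if record.name == node_name:
            return "primary" if record.primary else "secondary"
    return None


# Illustrative data only: two sites registered in the database.
records = [
    GeoNodeRecord(name="site-a", primary=True),
    GeoNodeRecord(name="site-b", primary=False),
]

# A container configured with node_name "site-b" learns its role from the DB.
role = resolve_role("site-b", records)
```

Note that nothing on the machine itself says "secondary" in this model: the answer depends entirely on the name matching a record.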
Saying it out loud felt wrong. We have Omnibus roles like geo_primary_role and geo_secondary_role, but those roles carry much less significance than you'd think. Primary/secondaryness matters a lot in a machine's configuration, e.g. the DBs need to be configured completely differently. This is evident during promotion/demotion.
It's cool that the DB is an SSOT for primary/secondaryness, but that doesn't help at all in at least 2 obvious cases:
- A machine's node_name doesn't match any GeoNode record's name
- A machine's configuration's primary/secondaryness doesn't match its GeoNode record's primary/secondaryness
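The two cases can be sketched as a consistency check. This is only an illustration under stated assumptions: GeoNodeRecord, check_role_consistency, and the configured_role argument are hypothetical names, and no such check is claimed to exist in GitLab.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GeoNodeRecord:
    """Hypothetical stand-in for a row in the GeoNode table."""
    name: str
    primary: bool


def check_role_consistency(
    node_name: str,
    configured_role: str,
    records: List[GeoNodeRecord],
) -> List[str]:
    """Return a list of problems; an empty list means config and DB agree.

    configured_role is what the machine's own configuration implies
    ("primary" or "secondary"); this parameter is an assumption.
    """
    match = next((r for r in records if r.name == node_name), None)
    if match is None:
        # Case 1: node_name doesn't match any GeoNode record's name.
        return [f"no GeoNode record named {node_name!r}"]

    db_role = "primary" if match.primary else "secondary"
    if db_role != configured_role:
        # Case 2: the machine's configuration disagrees with its record.
        return [
            f"{node_name!r} is configured as {configured_role} "
            f"but its GeoNode record says {db_role}"
        ]
    return []


# Illustrative data only.
records = [GeoNodeRecord(name="site-a", primary=True)]
problems = check_role_consistency("site-a", "secondary", records)
```

In both cases the DB remains the SSOT, yet it cannot tell the machine anything useful about how it should actually be configured.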
This has made me uncomfortable since then, but no great ideas to resolve this have occurred to me. I don't really know what the issue is exactly. So I don't know where to go with this but thought I'd voice the concern.
@WarheadsSE I'm wondering what your thoughts are. I feel like the mental model you started with might be a better model to work towards but I don't know what exactly that looks like. Spin up a cluster with geo_secondary: true configured, and it all works like a secondary?
Conclusion
It's fine to declare a site as primary or secondary in configs. We already do that. But keep in mind that it will add work to deprecate those configs when we move to use Consul as the SSOT for those values.