Rediscover postgres service discovery load balancer once ttl expires

Simon Tomlinson requested to merge load-balancer-dns-refresh into master

What does this MR do and why?

Instead of memoizing the resolver indefinitely, we re-fetch it once the TTL returned by the DNS server has expired.

This avoids caching the Consul IP address indefinitely and helps us avoid incidents like this.
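
As a rough illustration of the idea (not the code in this MR), TTL-bound memoization can be sketched as below. The class name `TtlCachedResolver`, the injectable `clock`, and the `lookup` block returning a `[resolver, ttl_in_seconds]` pair are all hypothetical names chosen for the example; the actual change lives in `Gitlab::Database::LoadBalancing::ServiceDiscovery` and may differ in its details.

```ruby
# Hypothetical sketch of TTL-bound memoization; this is NOT the actual
# Gitlab::Database::LoadBalancing::ServiceDiscovery implementation.
class TtlCachedResolver
  # clock:  callable returning the current Time (injectable for testing).
  # lookup: block returning [resolver, ttl_in_seconds] (assumed shape).
  def initialize(clock: -> { Time.now }, &lookup)
    @clock = clock
    @lookup = lookup
    @resolver = nil
    @expires_at = nil
  end

  # Returns the cached resolver while the TTL is still valid,
  # otherwise performs a fresh lookup and resets the expiry time.
  def resolver
    return @resolver if @resolver && @clock.call < @expires_at

    @resolver, ttl = @lookup.call
    @expires_at = @clock.call + ttl
    @resolver
  end
end
```

With this shape, repeated calls to `resolver` within the TTL return the same object, and the first call after expiry triggers a new lookup, so a rotated IP address would be picked up at that point.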

How to set up and validate locally

NOTE: Instructions for setting up DB load balancing with service discovery in a local environment (GDK) can be found here

configuration = Gitlab::Database::LoadBalancing::Configuration.for_model(::ActiveRecord::Base)
load_balancer = Gitlab::Database::LoadBalancing::LoadBalancer.new(configuration)

# Change the nameserver from localhost to a custom one, so that the resolver uses "ip_address_from_dns" instead of "ip_address"
sd_params = configuration.service_discovery.merge(nameserver: 'registry.gitlab.com')
sd = Gitlab::Database::LoadBalancing::ServiceDiscovery.new(load_balancer, **sd_params)

resolver = sd.resolver
old_resolver_object_id = resolver.object_id
# Inspect the TTL currently stored for the nameserver
sd.instance_variable_get(:@nameserver_ttl)

# Simulate an expired TTL by setting it to a time in the past
sd.instance_variable_set(:@nameserver_ttl, 1.minute.ago)
resolver = sd.resolver
new_resolver_object_id = resolver.object_id

old_resolver_object_id == new_resolver_object_id
# => false, because the resolver object is re-fetched (from this point on, any rotated IP address is used for further lookups)

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Closes #384348 (closed)

Edited by Prabakaran Murugesan