Change GitLab::LoadBalancing to use a native Rails implementation for connection handling

This was spun off from !50959 (merged):

  • @ayufan started a discussion:

    This is OK-ish, but hacky.

    I wish we would move away from the LoadBalancing module and instead implement this using:

    • the connection handlers of Rails
    • connection_handling to indicate what type of access is needed.

    Rails would then automatically configure caching and handle the fetch and release of all connections.

    Like this:

    ActiveRecord::Base.connection_handlers = { writing_role => ActiveRecord::Base.default_connection_handler }
    
    # register all other LB readers

    Then use:

    ActiveRecord::Base.connected_to(database: { readonly_slow: :animals_slow_replica }) do

    This additionally allows us to configure models to use replicas by default:

    connects_to database: { writing: :primary, reading: :primary_replica }

    Or configure a per-thread connection handler for the current execution:

    ActiveRecord::Base.connection_handler = ActiveRecord::Base.connection_handlers[:primary_replica]

    More info here:

We could remove a lot of legacy baggage if we dropped the LoadBalancing module and instead used a built-in mechanism of Rails for load balancing. This would offer significantly more flexibility and remove a lot of complexity from the GitLab codebase.
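
To make the role-switching idea concrete, here is a toy model in plain Ruby of what "a per-thread handler selected by role" could look like. It is only a sketch and not Rails code: `RoleRegistry` and all of its methods are hypothetical names invented for illustration, and the "handlers" are arbitrary objects standing in for connection handlers.

```ruby
# Toy model of role-based connection switching. NOT Rails code:
# RoleRegistry and its methods are hypothetical names for illustration.
class RoleRegistry
  def initialize(default_role:)
    @handlers = {}
    @default_role = default_role
    # Unique thread-variable key so several registries do not collide.
    @key = :"role_registry_#{object_id}"
  end

  # Associate a role (e.g. :writing, :reading) with a handler object.
  def register(role, handler)
    @handlers[role] = handler
  end

  # The role active on the current thread, or the default if none is set.
  def current_role
    Thread.current.thread_variable_get(@key) || @default_role
  end

  # The handler for the role active on the current thread.
  def current_handler
    @handlers.fetch(current_role)
  end

  # Run the block with the given role active on this thread only,
  # restoring the previous role afterwards (even if the block raises).
  def connected_to(role:)
    raise ArgumentError, "unknown role: #{role.inspect}" unless @handlers.key?(role)

    previous = Thread.current.thread_variable_get(@key)
    begin
      Thread.current.thread_variable_set(@key, role)
      yield
    ensure
      Thread.current.thread_variable_set(@key, previous)
    end
  end
end
```

With a registry that maps `:writing` to a primary pool and `:reading` to a replica pool, `registry.connected_to(role: :reading) { registry.current_handler }` would return the replica pool while other threads keep using the default role. The explicit block is the point of the design: code states which kind of access it needs instead of the module guessing.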

See also

Features we need to support

  1. User stickiness. Users should remain connected to the primary for reads immediately after they've performed a write. The stickiness should persist for a period of time based on the session (not request) store.
  2. Allow Sidekiq jobs to opt in to using replicas.
  3. Separation of writing/reading connections. The current module uses "guessing" to figure out whether we need a "primary" connection. Will that still work with the Rails implementation, or do we always need to request the write (or read) connection explicitly, which could mean updating a lot of code?
  4. Automatic load balancing of replicas. Currently we (I assume) round-robin between replica connections. How does Rails handle this and how would we implement it ourselves?
  5. Service discovery for DNS to locate the hosts. How does Rails behave when the DNS record gives multiple IP addresses for a replica? Does it obtain a connection for each? Does it also support the SRV records we use?
  6. Logging and metrics. We'll want to ensure that all the important logging and metrics we track in the current implementation are preserved in the new implementation; otherwise we'll need to update our monitoring tooling.
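
As a rough illustration of items 1 and 4, the following plain-Ruby sketch combines a time-based stickiness window stored in the session with round-robin selection over replicas. Everything here is an assumption made for illustration: `ReplicaList`, `StickyRouter`, the `:db_stick_until` session key, and the 30-second default are hypothetical and are not taken from the current LoadBalancing module or from Rails.

```ruby
# Hypothetical sketch; none of these names exist in GitLab or Rails.

# Round-robin selection over a fixed list of replica hosts.
class ReplicaList
  def initialize(hosts)
    raise ArgumentError, "at least one replica is required" if hosts.empty?

    @hosts = hosts.dup.freeze
    @index = 0
    @mutex = Mutex.new
  end

  # Returns the next host, wrapping around at the end of the list.
  def next_host
    @mutex.synchronize do
      host = @hosts[@index]
      @index = (@index + 1) % @hosts.size
      host
    end
  end
end

# Session-scoped stickiness: after a write, reads for that session stay
# on the primary until the stickiness window has elapsed.
class StickyRouter
  # stick_for is in seconds (30 is an assumed value). The clock is
  # injectable for testing and uses wall-clock time by default, so the
  # stored timestamp stays meaningful across processes.
  def initialize(replicas:, stick_for: 30, clock: -> { Time.now.to_f })
    @replicas = replicas
    @stick_for = stick_for
    @clock = clock
  end

  # Record a write in the session store (any Hash-like object).
  def wrote!(session)
    session[:db_stick_until] = @clock.call + @stick_for
  end

  # Pick the host that should serve a read for this session.
  def host_for_read(session)
    stick_until = session[:db_stick_until]
    return :primary if stick_until && @clock.call < stick_until

    @replicas.next_host
  end
end
```

Because the deadline lives in the session rather than in request state, a user who writes in one request is still routed to the primary on the next request. A production version would also need to handle unhealthy replicas and the service discovery in item 5, which this sketch leaves out.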