
Add SQL-based election for shard primaries

Stan Hu requested to merge sh-improve-sql-leader-election into master

This commit adds the following strategy to enable redundant Praefect nodes to run simultaneously:

  1. Every Praefect node periodically (every second) performs a health check RPC with a Gitaly node.

  2. For each node, Praefect updates a row in a new table (node_status) with the following information:

    1. The name of the Praefect instance (praefect_name)
    2. The name of the virtual storage (shard_name)
    3. The name of the Gitaly storage (storage_name)
    4. The timestamp of the last time Praefect tried to reach that node (last_contact_attempt_at)
    5. The timestamp of the last successful health check (last_seen_active_at)
  3. Periodically, every Praefect node runs a SELECT on node_status to determine the healthy nodes. A healthy node is defined by:

    1. A node that has a recent successful health check (e.g. one in the last 10 s).
    2. A majority of the available Praefect nodes have entries that match the criterion above.
  4. To determine the majority, we use a lightweight service discovery protocol: a Praefect node is deemed a voting member if the praefect_name has a recent last_contact_attempt_at in the node_status table. The name is derived from a combination of the hostname and listening port/socket.

  5. The primary of each shard is listed in the shard_primaries table. If the current primary is in the healthy node list, then no election needs to be done.

  6. Otherwise, if there is no primary or it is unhealthy, any Praefect node can elect a new primary by choosing a candidate from the healthy node list and inserting a row into the shard_primaries table.
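
The steps above can be sketched roughly as follows. This is an illustration only, not the code in this merge request: it uses Python with an in-memory SQLite database as a stand-in for the real database, and the column types, keys, the shard_primaries layout, the per-shard scoping of voters, the "first candidate by name" choice, and the helper names (record_check, healthy_storages, ensure_primary) are all assumptions. Only the table names and the node_status column names come from the description.

```python
import sqlite3

# Hypothetical schema: table and node_status column names follow the
# description; types, keys and the shard_primaries columns are assumptions.
SCHEMA = """
CREATE TABLE node_status (
    praefect_name           TEXT NOT NULL,
    shard_name              TEXT NOT NULL,
    storage_name            TEXT NOT NULL,
    last_contact_attempt_at REAL,
    last_seen_active_at     REAL,
    PRIMARY KEY (praefect_name, shard_name, storage_name)
);
CREATE TABLE shard_primaries (
    shard_name   TEXT PRIMARY KEY,
    storage_name TEXT NOT NULL
);
"""

WINDOW = 10.0  # seconds; "recent" as in the 10 s example from the description


def record_check(db, praefect, shard, storage, now, ok):
    """Step 2: record one Praefect's health check result for one Gitaly node."""
    key = (praefect, shard, storage)
    where = "WHERE praefect_name = ? AND shard_name = ? AND storage_name = ?"
    db.execute(
        "INSERT OR IGNORE INTO node_status "
        "(praefect_name, shard_name, storage_name) VALUES (?, ?, ?)",
        key,
    )
    # Every attempt is recorded, successful or not.
    db.execute(
        f"UPDATE node_status SET last_contact_attempt_at = ? {where}",
        (now, *key),
    )
    if ok:
        # Only a successful check advances last_seen_active_at.
        db.execute(
            f"UPDATE node_status SET last_seen_active_at = ? {where}",
            (now, *key),
        )


def healthy_storages(db, shard, now):
    """Steps 3-4: storages that a majority of voting Praefects saw healthy."""
    cutoff = now - WINDOW
    # A Praefect is a voter if it recently attempted contact.
    # Assumption: voters are counted per shard.
    voters = db.execute(
        "SELECT COUNT(DISTINCT praefect_name) FROM node_status "
        "WHERE shard_name = ? AND last_contact_attempt_at >= ?",
        (shard, cutoff),
    ).fetchone()[0]
    quorum = voters // 2 + 1
    rows = db.execute(
        "SELECT storage_name FROM node_status "
        "WHERE shard_name = ? AND last_contact_attempt_at >= ? "
        "AND last_seen_active_at >= ? "
        "GROUP BY storage_name "
        "HAVING COUNT(DISTINCT praefect_name) >= ? "
        "ORDER BY storage_name",
        (shard, cutoff, cutoff, quorum),
    ).fetchall()
    return [row[0] for row in rows]


def ensure_primary(db, shard, now):
    """Steps 5-6: keep a healthy primary, otherwise elect a new one."""
    healthy = healthy_storages(db, shard, now)
    row = db.execute(
        "SELECT storage_name FROM shard_primaries WHERE shard_name = ?",
        (shard,),
    ).fetchone()
    if row and row[0] in healthy:
        return row[0]  # current primary is healthy, no election needed
    if not healthy:
        return None  # nothing to elect from
    candidate = healthy[0]  # assumption: pick the first candidate by name
    db.execute(
        "INSERT OR REPLACE INTO shard_primaries (shard_name, storage_name) "
        "VALUES (?, ?)",
        (shard, candidate),
    )
    return candidate


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
for praefect in ("praefect-1", "praefect-2", "praefect-3"):
    # praefect-1 and praefect-2 reach gitaly-1; only praefect-3 reaches gitaly-2.
    record_check(db, praefect, "default", "gitaly-1", now=100.0,
                 ok=praefect != "praefect-3")
    record_check(db, praefect, "default", "gitaly-2", now=100.0,
                 ok=praefect == "praefect-3")
print(ensure_primary(db, "default", now=101.0))  # prints gitaly-1
```

In this sketch gitaly-1 is elected because two of the three voting Praefect instances saw it healthy within the window, while gitaly-2 has only one such entry. The sketch does not guard against two Praefect nodes electing at the same time; a real implementation would need the database to arbitrate that.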

#2547 (closed)

Edited by GitLab Release Tools Bot
