
Add circuit breaker for zoekt nodes

John Mason requested to merge jm-zoekt-circuit-breaker into master

What does this MR do and why?

Whenever an operation against a zoekt node fails, we trigger a backoff that excludes that node from searches until the backoff expires. Backoffs follow an exponential strategy. When all zoekt nodes are in a backoff state, the circuit breaker is tripped and zoekt integration is disabled completely until at least one node's backoff period expires.

Note that as long as a single node is operational, the circuit breaker is not tripped and zoekt searches are still performed.
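To make the behaviour described above concrete, here is a minimal standalone sketch of a per-node exponential backoff with an "all nodes backing off" circuit breaker. It is not the code in this MR: the class names `NodeBackoff` and `CircuitBreaker`, the injectable clock, the base of 2 seconds, and the 30-minute cap are all assumptions made for illustration. The base was picked only so that four consecutive failures yield 16 seconds, which matches the figure quoted in the validation steps; the real formula may differ.

```ruby
# Illustrative sketch only; names and constants are hypothetical,
# not the classes added by this MR.
class NodeBackoff
  BASE_SECONDS = 2     # assumption: duration is BASE_SECONDS**failures
  MAX_SECONDS  = 1800  # assumption: cap so the backoff cannot grow forever

  attr_reader :failures

  # The clock is injectable so the sketch can be exercised without sleeping.
  def initialize(clock: -> { Time.now.to_f })
    @clock = clock
    @failures = 0
    @expires_at = nil
  end

  # Record one failure and restart the backoff window with a longer duration.
  def backoff!
    @failures += 1
    duration = [BASE_SECONDS**@failures, MAX_SECONDS].min
    @expires_at = @clock.call + duration
  end

  def seconds_remaining
    return 0 if @expires_at.nil?

    [@expires_at - @clock.call, 0].max
  end

  # True while the node should be excluded from searches.
  def enabled?
    seconds_remaining > 0
  end

  # Clear the failure history, e.g. after a successful operation.
  def reset!
    @failures = 0
    @expires_at = nil
  end
end

# Tripped only when every node is currently backing off.
class CircuitBreaker
  def initialize(backoffs)
    @backoffs = backoffs
  end

  # Backoffs of the nodes that may still serve searches.
  def available
    @backoffs.reject(&:enabled?)
  end

  # With no nodes at all, this is also true: there is nothing to search.
  def tripped?
    available.empty?
  end
end
```

With two nodes in this sketch, putting only one of them into backoff leaves `tripped?` false, so searches would continue against the remaining node. The breaker trips once the second node also fails, and resets on its own as soon as either backoff expires.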

Related to #393445 (closed)

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

How to set up and validate locally

  1. Configure local zoekt environment
     ::Feature.enable(:index_code_with_zoekt)
     ::Feature.enable(:search_code_with_zoekt)
     ::Feature.enable(:zoekt_node_backoffs)
     zoekt_node = ::Search::Zoekt::Node.find_or_create_by!(
       index_base_url: 'http://127.0.0.1:6080/',
       search_base_url: 'http://127.0.0.1:6090/'
     ) { |n| n.uuid = SecureRandom.uuid }
     namespace = Namespace.find_by_full_path("flightjs") # Some namespace you want to enable
     ::Zoekt::IndexedNamespace.find_or_create_by!(node: zoekt_node, namespace: namespace.root_ancestor)
  2. Go to flightjs and perform searches. Notice that Exact code search (powered by zoekt) appears on the right side, above the search results.

  3. In the console, simulate some failures by manually triggering a backoff. This example trips the circuit breaker and disables zoekt for roughly 16 seconds:

     4.times { zoekt_node.backoff.backoff! }

  4. The zoekt node's backoff should now be enabled. You can check the remaining backoff time by running zoekt_node.backoff.seconds_remaining. Run it several times; the value should decrease over time.
  5. Go to flightjs and perform searches. Notice that Exact code search (powered by zoekt) is no longer shown. Search falls back to Elasticsearch if it is enabled, or to basic database search otherwise.
  6. Wait until zoekt_node.backoff.seconds_remaining reaches zero and the backoff expires.
  7. Go to flightjs and perform searches. Notice that Exact code search (powered by zoekt) is shown again on the right side, above the search results.
  8. The zoekt node's backoff should now be disabled. You can verify this by running zoekt_node.backoff.enabled?, which should return false.
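The waiting step can be scripted rather than polled by hand. The helper below is a hypothetical console convenience, not part of this MR: `wait_for_backoff`, `poll_interval`, `timeout`, and `sleeper` are names invented here. It relies only on the `enabled?` and `seconds_remaining` methods used in the steps above, and assumes `seconds_remaining` returns something printable.

```ruby
# Hypothetical console helper: block until a backoff has expired.
# `backoff` is any object responding to #enabled? and #seconds_remaining,
# such as zoekt_node.backoff in the validation steps.
# `sleeper` is injectable so the helper can be tested without real sleeping.
def wait_for_backoff(backoff, poll_interval: 1, timeout: 60, sleeper: ->(s) { sleep(s) })
  waited = 0
  while backoff.enabled?
    raise "backoff still active after #{timeout}s" if waited >= timeout

    puts "backoff active, #{backoff.seconds_remaining}s remaining"
    sleeper.call(poll_interval)
    waited += poll_interval
  end
  waited # total seconds spent waiting
end
```

In the console this would be called as `wait_for_backoff(zoekt_node.backoff)`, after which the search check and the `enabled?` check can be repeated.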

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by John Mason
