ElasticSearch returns 500 for searches if one node in cluster is down

Summary

When GitLab is configured to use an ElasticSearch cluster and any one of the nodes goes down, searches start returning 500 errors because the connection to that node is refused.

Steps to reproduce

  1. Configure GitLab to use an ElasticSearch (ES) cluster
  2. Take one of the nodes offline
  3. Run a search in GitLab

What is the current bug behavior?

The search page returns a 500 error.

What is the expected correct behavior?

Searches should still succeed using the remaining nodes in the cluster.
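
To illustrate the expected behavior, here is a minimal sketch of client-side failover across cluster nodes. It is not GitLab's code and not the Elasticsearch client's API: the `FailoverSearch` class, its `get` method, and the injectable requester block are all hypothetical names introduced for this example. It assumes each node is reachable over plain HTTP at `host:port` and that a refused connection means "try the next node".

```ruby
require "net/http"
require "uri"

# Hypothetical helper (not part of GitLab): tries each configured node in
# order and returns the first successful response.
class FailoverSearch
  # Errors treated as "this node is unreachable", as opposed to a bad query.
  NODE_DOWN_ERRORS = [
    Errno::ECONNREFUSED,
    Errno::EHOSTUNREACH,
    SocketError,
    Net::OpenTimeout
  ].freeze

  # hosts: array of "host:port" strings.
  # An optional block (host, path) -> response replaces the real HTTP call,
  # which makes the failover logic easy to exercise without a cluster.
  def initialize(hosts, &requester)
    @hosts = hosts
    @requester = requester || method(:http_get)
  end

  # Returns the response from the first reachable node.
  # Re-raises the last connection error only if every node is down.
  def get(path)
    last_error = nil
    @hosts.each do |host|
      begin
        return @requester.call(host, path)
      rescue *NODE_DOWN_ERRORS => e
        last_error = e # remember the failure and move on to the next node
      end
    end
    raise last_error || ArgumentError.new("no hosts configured")
  end

  private

  # Default requester: a plain HTTP GET that returns the response body.
  def http_get(host, path)
    Net::HTTP.get(URI("http://#{host}#{path}"))
  end
end
```

With something along these lines, a search request would raise only when every node refuses the connection, for example `FailoverSearch.new(["prod-vm-elas1-clnt-01:9200", "another-node:9200"]).get("/_search")`, where `another-node` is a placeholder for a second cluster node.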

Relevant logs and/or screenshots

ActionView::Template::Error (Connection refused - connect(2) for "prod-vm-elas1-clnt-01" port 9200 (prod-vm-elas1-clnt-01:9200)): 
56: = link_to search_filter_path(scope: 'projects') do 
57: Projects 
58: %span.badge 
59: = @search_results.projects_count 
60: %li{ class: active_when(@scope == 'issues') } 
61: = link_to search_filter_path(scope: 'issues') do 
62: Issues 
lib/gitlab/elastic/search_results.rb:37:in `projects_count' 
app/views/search/_category.html.haml:59:in `block in _app_views_search__category_html_haml__1025586116889026443_70334080998640' 
app/views/search/_category.html.haml:56:in `_app_views_search__category_html_haml__1025586116889026443_70334080998640' 
app/views/search/show.html.haml:6:in `_app_views_search_show_html_haml___2439701067896620392_70334037248940' 
lib/gitlab/middleware/multipart.rb:93:in `call' 
lib/gitlab/request_profiler/middleware.rb:14:in `call' 
lib/gitlab/middleware/go.rb:16:in `call' 
lib/gitlab/etag_caching/middleware.rb:10:in `call' 
lib/gitlab/middleware/readonly_geo.rb:30:in `call' 
lib/gitlab/request_context.rb:18:in `call'

GitLab Version

9.1.4-ee

https://gitlab.zendesk.com/agent/tickets/78701

Edited Jun 19, 2017 by Cindy Pallares 🦉