Indexing fails with `error: elastic: Error 429 (Too Many Requests)`

Summary

When enabling indexing on an instance, a customer ran into an issue where ElasticCommitIndexerWorker jobs started failing with `error: elastic: Error 429 (Too Many Requests)`.

We don't seem to be following Elastic's recommendation:

Make sure to watch for TOO_MANY_REQUESTS (429) response codes (EsRejectedExecutionException with the Java client), which is the way that Elasticsearch tells you that it cannot keep up with the current indexing rate.

Enabling a throttling mechanism was brought up during an interaction with Elastic support in "Figure out why Elasticsearch was OOM".
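To illustrate what reacting to 429 responses could look like, here is a minimal sketch of retrying a bulk request with exponential backoff. It is not how the indexer currently behaves, and every name in it (`TooManyRequestsError`, `with_backoff`, and the default delays) is hypothetical and chosen only for illustration.

```ruby
# Hypothetical error class standing in for an Elasticsearch 429 response.
class TooManyRequestsError < StandardError; end

# Calls the block, retrying with exponential backoff when it raises
# TooManyRequestsError. Gives up and re-raises after max_attempts tries.
# The sleeper argument exists so the waiting can be replaced in tests.
def with_backoff(max_attempts: 5, base_delay: 0.5, max_delay: 30.0,
                 sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue TooManyRequestsError
    raise if attempt >= max_attempts

    # Double the delay on every failed attempt, capped at max_delay.
    delay = [base_delay * (2**(attempt - 1)), max_delay].min
    sleeper.call(delay)
    retry
  end
end
```

A caller would wrap the code that sends one bulk request in `with_backoff { ... }` and raise `TooManyRequestsError` when the response status is 429. A production version would probably also add random jitter to the delay so that many workers do not retry at the same moment.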

Customer ticket: https://gitlab.zendesk.com/agent/tickets/262214

Steps to reproduce

I was not able to reproduce it locally, possibly because I didn't have enough Sidekiq threads.

Relevant logs and/or screenshots

"Flushing error: Failed to perform all operations\"\n2022/01/20 11:24:19 bulk request 20: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:24:22 bulk request 16: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:24:39 bulk request 21: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:25:00 bulk request 31: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:25:02 bulk request 32: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:25:06 bulk request 26: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:25:09 bulk request 27: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:25:12 bulk request 28: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:25:16 bulk request 52: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:25:21 bulk request 49: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:25:27 bulk request 58: error: elastic: Error 429 (Too Many Requests)\n2022/01/20 11:25:47 bulk request 67: error: elastic: Error 429 (Too Many Requests)\n","error_class":"Gitlab::Elastic::Indexer::Error"

Current workaround

Indexing concurrency

To decrease the indexing throughput, configure Bulk request concurrency in the Settings -> Advanced Search section. The default is 10, but you can set it as low as 1 to reduce the number of concurrent indexing operations.
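As a rough illustration of why lowering this setting reduces load on the cluster, the sketch below runs a list of jobs with a fixed number of worker threads, so that no more than `concurrency` requests are in flight at once. It is a generic bounded-concurrency example, not the indexer's actual implementation, and `run_bounded` is a hypothetical helper.

```ruby
# Runs every job (any object responding to #call) using `concurrency`
# worker threads, so at most that many jobs execute at the same time.
# Returns the peak number of jobs that were in flight simultaneously.
def run_bounded(jobs, concurrency:)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close # pop returns nil once the closed queue is empty

  lock = Mutex.new
  in_flight = 0
  peak = 0

  workers = Array.new(concurrency) do
    Thread.new do
      while (job = queue.pop)
        lock.synchronize do
          in_flight += 1
          peak = [peak, in_flight].max
        end
        begin
          job.call
        ensure
          lock.synchronize { in_flight -= 1 }
        end
      end
    end
  end

  workers.each(&:join)
  peak
end
```

With `concurrency: 1` the jobs run strictly one after another, which is the most conservative case described above.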

Queue selector

This is an additional workaround if changing Indexing concurrency didn't help.

We believe that Queue selector can be used to restrict indexing jobs to specific Sidekiq nodes, which should reduce the number of indexing requests and, we hope, eliminate or reduce the 429 (Too Many Requests) errors.

  1. Pick one Sidekiq node to dedicate to indexing, and add these lines to /etc/gitlab/gitlab.rb. This will configure 1 process dedicated to indexing.

     sidekiq['enable'] = true
     sidekiq['queue_selector'] = true
     sidekiq['queue_groups'] = [
       "feature_category=global_search"
     ]

  2. Save the file and reconfigure GitLab for the changes to take effect:

     sudo gitlab-ctl reconfigure

  3. Add these lines to /etc/gitlab/gitlab.rb on the other, non-indexing nodes. This will configure 1 process for non-indexing background jobs.

     sidekiq['enable'] = true
     sidekiq['queue_selector'] = true
     sidekiq['queue_groups'] = [
       "feature_category!=global_search"
     ]

  4. Save the file(s) and reconfigure GitLab for the changes to take effect:

     sudo gitlab-ctl reconfigure

Edited Feb 01, 2022 by Dmitry Gruzd