Skip to content

Introduce instrumentation for all Elasticsearch HTTP requests

Dylan Griffith requested to merge 214279-es-instrumentation into master

What does this MR do?

This introduces the same pattern of instrumentation we use for Redis. This MR also adds the instrumentation details to sidekiq logs like we do with Redis.

This instruments at the lowest level by instrumenting the elasticsearch-transport gem which is used by the elasticsearch-rails gem and elasticsearch-api gem which are the higher level abstractions used by GitLab. This has been tested to log requests coming from both places.

Other options considered for this solution involved following a similar pattern to https://github.com/elastic/elasticsearch-rails/blob/1c3226d8186af6e8446fa5091f5b27760c29f896/elasticsearch-rails/lib/elasticsearch/rails/instrumentation/log_subscriber.rb which would extend the log subscriber pattern to be more than just searches in the context of the elasticsearch-rails gem but this seemed more convoluted than the approach I took since it requires monkey patching to publish a message then a separate class to subscribe and store those messages and so the pattern didn't seem to be beneficial unless you plan to have multiple subscribers.

Additionally I considered using Faraday middleware to track this since the Elasticsearch gem uses Faraday by default but again I opted not to do that since it seemed trickier to manipulate the client config following https://github.com/elastic/elasticsearch-ruby/tree/master/elasticsearch-transport#transport-implementations and implement the required middleware etc. than just prepending the method following the same pattern as Redis.

Follow up MRs for this will be:

  1. !32937 (merged)
  2. !32939 (merged)

Screenshots

search request log
{
  "method": "GET",
  "path": "/search/count",
  "format": "json",
  "controller": "SearchController",
  "action": "count",
  "status": 200,
  "time": "2020-05-26T05:44:17.867Z",
  "params": [
    {
      "key": "group_id",
      "value": ""
    },
    {
      "key": "project_id",
      "value": "18"
    },
    {
      "key": "repository_ref",
      "value": "master"
    },
    {
      "key": "scope",
      "value": "commits"
    },
    {
      "key": "search",
      "value": "searchable"
    }
  ],
  "remote_ip": "127.0.0.1",
  "user_id": 1,
  "username": "root",
  "ua": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36",
  "queue_duration_s": 0.851586,
  "correlation_id": "AYCIsTjmcs7",
  "meta.user": "root",
  "meta.caller_id": "SearchController#count",
  "gitaly_calls": 11,
  "gitaly_duration_s": 0.121262,
  "redis_calls": 7,
  "redis_duration_s": 0.006238,
  "redis_read_bytes": 526,
  "redis_write_bytes": 52816,
  "elasticsearch_calls": 1,
  "elasticsearch_duration_s": 0.013692,
  "cpu_s": 0.42,
  "db_duration_s": 0.00364,
  "view_duration_s": 0.00019,
  "duration_s": 0.22047
}
ElasticIndexBulkCronWorker job
{
  "severity": "INFO",
  "time": "2020-05-26T05:46:22.150Z",
  "queue": "cronjob:elastic_index_bulk_cron",
  "args": [],
  "class": "ElasticIndexBulkCronWorker",
  "retry": 0,
  "backtrace": true,
  "queue_namespace": "cronjob",
  "jid": "a60a31e069bdd376a08bb346",
  "created_at": "2020-05-26T05:46:21.287Z",
  "meta.caller_id": "Cronjob",
  "correlation_id": "4431644e8115072504e58b30474f7485",
  "uber-trace-id": "d96d912b691b04ff:d96d912b691b04ff:0:1",
  "enqueued_at": "2020-05-26T05:46:21.289Z",
  "pid": 56101,
  "message": "ElasticIndexBulkCronWorker JID-a60a31e069bdd376a08bb346: done: 0.756067 sec",
  "job_status": "done",
  "scheduling_latency_s": 0.105121,
  "redis_calls": 6,
  "redis_duration_s": 0.003488,
  "redis_read_bytes": 104,
  "redis_write_bytes": 628,
  "elasticsearch_calls": 1,
  "elasticsearch_duration_s": 0.130979,
  "extra.elastic_index_bulk_cron_worker.records_count": 3,
  "duration_s": 0.756067,
  "cpu_s": 0.381907,
  "completed_at": "2020-05-26T05:46:22.150Z",
  "db_duration_s": 0.001664
}
Sample of `detail_store` showing it capturing bulk API requests, searches and refreshes:
[{:method=>"POST",
  :path=>"_bulk",
  :params=>{},
  :body=>
   "{\"index\":{\"_index\":\"gitlab-test\",\"_type\":\"doc\",\"_id\":\"issue_1\",\"routing\":\"project_1\"}}\n{\"id\":1,\"iid\":1,\"title\":\"new issue\",\"description\":null,\"created_at\":\"2020-05-22T05:10:36.746Z\",\"updated_at\":\"2020-05-22T05:10:36.746Z\",\"state\":\"opened\",\"project_id\":1,\"author_id\":1,\"confidential\":false,\"assignee_id\":[],\"type\":\"issue\",\"join_field\":{\"name\":\"issue\",\"parent\":\"project_1\"}}\n",
  :duration=>0.099494,
  :backtrace=>
   ["lib/gitlab/instrumentation/elasticsearch_transport.rb:16:in `perform_request'",
    "ee/lib/gitlab/elastic/bulk_indexer.rb:126:in `try_send_bulk'",
    "ee/lib/gitlab/elastic/bulk_indexer.rb:109:in `send_bulk'",
    "ee/lib/gitlab/elastic/bulk_indexer.rb:43:in `flush'",
    "ee/app/services/elastic/process_bookkeeping_service.rb:74:in `execute_with_redis'",
    "ee/app/services/elastic/process_bookkeeping_service.rb:51:in `block in execute'",
    "lib/gitlab/redis/wrapper.rb:18:in `block in with'",
    "lib/gitlab/redis/wrapper.rb:18:in `with'",
    "ee/app/services/elastic/process_bookkeeping_service.rb:46:in `with_redis'",
    "ee/app/services/elastic/process_bookkeeping_service.rb:51:in `execute'",
    "ee/spec/support/helpers/elasticsearch_helpers.rb:6:in `ensure_elasticsearch_index!'",
    "spec/lib/gitlab/instrumentation/elasticsearch_transport_spec.rb:33:in `block (3 levels) in <main>'",
    "spec/spec_helper.rb:305:in `block (3 levels) in <main>'",
    "spec/support/sidekiq_middleware.rb:9:in `with_sidekiq_server_middleware'",
    "spec/spec_helper.rb:296:in `block (2 levels) in <main>'",
    "spec/spec_helper.rb:292:in `block (3 levels) in <main>'",
    "spec/spec_helper.rb:292:in `block (2 levels) in <main>'",
    "spec/spec_helper.rb:286:in `block (3 levels) in <main>'",
    "lib/gitlab/with_request_store.rb:17:in `enabling_request_store'",
    "lib/gitlab/with_request_store.rb:10:in `with_request_store'",
    "spec/spec_helper.rb:286:in `block (2 levels) in <main>'"]},
 {:method=>"POST",
  :path=>"gitlab-test/_refresh",
  :params=>{},
  :body=>nil,
  :duration=>0.039379,
  :backtrace=>
   ["lib/gitlab/instrumentation/elasticsearch_transport.rb:16:in `perform_request'",
    "ee/lib/gitlab/elastic/helper.rb:92:in `refresh_index'",
    "ee/spec/support/helpers/elasticsearch_helpers.rb:13:in `refresh_index!'",
    "ee/spec/support/helpers/elasticsearch_helpers.rb:9:in `ensure_elasticsearch_index!'",
    "spec/lib/gitlab/instrumentation/elasticsearch_transport_spec.rb:33:in `block (3 levels) in <main>'",
    "spec/spec_helper.rb:305:in `block (3 levels) in <main>'",
    "spec/support/sidekiq_middleware.rb:9:in `with_sidekiq_server_middleware'",
    "spec/spec_helper.rb:296:in `block (2 levels) in <main>'",
    "spec/spec_helper.rb:292:in `block (3 levels) in <main>'",
    "spec/spec_helper.rb:292:in `block (2 levels) in <main>'",
    "spec/spec_helper.rb:286:in `block (3 levels) in <main>'",
    "lib/gitlab/with_request_store.rb:17:in `enabling_request_store'",
    "lib/gitlab/with_request_store.rb:10:in `with_request_store'",
    "spec/spec_helper.rb:286:in `block (2 levels) in <main>'"]},
 {:method=>"GET",
  :path=>"gitlab-test/doc/_search",
  :params=>{},
  :body=>
   {:query=>
     {:bool=>
       {:must=>[{:simple_query_string=>{:fields=>["title^2", "description"], :query=>"hello world", :default_operator=>:and}}],
        :filter=>[{:term=>{:type=>"issue"}}, {:has_parent=>{:parent_type=>"project", :query=>{:bool=>{:should=>[[{:bool=>{:filter=>[{:terms=>{:id=>[]}}, {:terms=>{"issues_access_level"=>[20, 10]}}]}}]]}}}}, {:term=>{:confidential=>false}}]}},
    :sort=>[{:updated_at=>{:order=>:desc}}, :_score],
    :highlight=>{:fields=>{:title=>{}, :description=>{}}}},
  :duration=>0.253693,
  :backtrace=>
   ["lib/gitlab/instrumentation/elasticsearch_transport.rb:16:in `perform_request'",
    "(pry):4:in `first'",
    "(pry):4:in `block (3 levels) in <main>'",
    "spec/lib/gitlab/instrumentation/elasticsearch_transport_spec.rb:36:in `block (3 levels) in <main>'",
    "spec/spec_helper.rb:305:in `block (3 levels) in <main>'",
    "spec/support/sidekiq_middleware.rb:9:in `with_sidekiq_server_middleware'",
    "spec/spec_helper.rb:296:in `block (2 levels) in <main>'",
    "spec/spec_helper.rb:292:in `block (3 levels) in <main>'",
    "spec/spec_helper.rb:292:in `block (2 levels) in <main>'",
    "spec/spec_helper.rb:286:in `block (3 levels) in <main>'",
    "lib/gitlab/with_request_store.rb:17:in `enabling_request_store'",
    "lib/gitlab/with_request_store.rb:10:in `with_request_store'",
    "spec/spec_helper.rb:286:in `block (2 levels) in <main>'"]}]

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Closes #214279 (closed)

Edited by Yorick Peterse

Merge request reports