
Add meta.root_caller_id for Sidekiq jobs

Thong Kuah requested to merge root_caller_id into master

What does this MR do and why?

The idea is to distinguish which jobs ultimately come from a Cronjob.

  • Why? Because we want to do a controlled shutdown: we want to quickly drain any non-Cronjob-initiated jobs before we decide to shut down the Sidekiq pods.
  • Why? Because we want to minimise any non-Cronjob-initiated jobs after we bring up the site.
  • Why? Because we want to validate with QA first. If QA fails, we have to roll back, and we want to minimise any "real" data loss in that case.

Related to gitlab-com/gl-infra/db-migration!181 (merged), gitlab-com/gl-infra/production#7064 (closed)

Note: We don't fill in meta.root_caller_id for web requests, as it's not needed currently
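The two sample log lines in the validation steps suggest a simple propagation rule: a job inherits meta.root_caller_id from the job that enqueued it, and if none is set (for example, the chain started from a web request), the first Sidekiq worker in the chain becomes the root. The following is a minimal sketch of that reading, not the code in this MR. The function child_context, the controller name, and the assumption that a Cronjob-scheduled job starts with meta.root_caller_id set to "Cronjob" are all hypothetical.

```python
from typing import Dict

CRONJOB = "Cronjob"  # value seen in the sample log line; treated as a constant here


def child_context(parent_class: str, parent_context: Dict[str, str]) -> Dict[str, str]:
    """Build the context of a job enqueued by a running job of class parent_class.

    Hypothetical helper: illustrates the inheritance rule only.
    """
    child = dict(parent_context)
    # The immediate caller is always the job doing the enqueueing.
    child["meta.caller_id"] = parent_class
    # Keep an existing root; otherwise the enqueueing worker becomes the root.
    child["meta.root_caller_id"] = parent_context.get("meta.root_caller_id", parent_class)
    return child


# Assumed context of a job scheduled directly by cron.
cron_started = {"meta.caller_id": CRONJOB, "meta.root_caller_id": CRONJOB}

# Assumed context of a job enqueued by a web request: no root is filled in.
# The controller name is made up for illustration.
web_started = {"meta.caller_id": "SomeController#create"}

from_cron = child_context("BuildFinishedWorker", cron_started)
from_web = child_context("Ci::InitialPipelineProcessWorker", web_started)
grandchild = child_context("BuildHooksWorker", from_web)
```

Under these assumptions, every job in a Cronjob-initiated chain keeps "Cronjob" as its root, while jobs in a web-initiated chain share the class name of the first worker.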

Screenshots or screen recordings

How to set up and validate locally

There are a few things to observe.

Cronjobs

  1. Watch log/sidekiq.log.
  2. Eventually some Cronjobs will run and in turn trigger other jobs. The log entries for those jobs include meta.root_caller_id.
  3. Grep for the correlation_id. All Sidekiq jobs with that correlation_id should have the same meta.root_caller_id:
{"severity":"INFO","time":"2022-05-19T00:00:31.883Z","retry":3,"queue":"default","backtrace":true,"version":0,"queue_namespace":"pipeline_default","args":["1050"],"class":"Ci::MergeRequests::AddTodoWhenBuildFailsWorker","jid":"05c48439e7f815627473d231","created_at":"2022-05-19T00:00:31.882Z","meta.caller_id":"BuildFinishedWorker","correlation_id":"928841d7708c8f5e19f7cb480c6afe93","meta.feature_category":"continuous_integration","meta.project":"gitlab-org/gitlab-shell","meta.root_namespace":"gitlab-org","meta.client_id":"ip/","meta.root_caller_id":"Cronjob","worker_data_consistency":"always","idempotency_key":"resque:gitlab:duplicate:default:de082e6dfc9235edd9f5fe6811df04c3f9745151c058b4cb06c23f5d4d4829fc","size_limiter":"validated","enqueued_at":"2022-05-19T00:00:31.883Z","job_size_bytes":6,"pid":86190,"message":"Ci::MergeRequests::AddTodoWhenBuildFailsWorker JID-05c48439e7f815627473d231: start","job_status":"start","scheduling_latency_s":0.000434}
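The grep check above can also be scripted. This is a rough sketch that assumes log/sidekiq.log holds one JSON object per line and that job entries can be recognised by the presence of a jid field. The function names are illustrative and not part of GitLab.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, Set


def root_callers_by_correlation(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each correlation_id to the set of meta.root_caller_id values seen."""
    roots: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # skip blank or non-JSON lines
        if not isinstance(entry, dict) or "jid" not in entry:
            continue  # assumption: only entries with a jid are job log lines
        correlation_id = entry.get("correlation_id")
        if correlation_id is None:
            continue
        roots[correlation_id].add(entry.get("meta.root_caller_id", "<missing>"))
    return dict(roots)


def inconsistent(roots: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return only the correlation_ids that carry more than one root caller."""
    return {cid: values for cid, values in roots.items() if len(values) > 1}
```

To use it, open log/sidekiq.log, pass the file object to root_callers_by_correlation, and pass the result to inconsistent. An empty dictionary means every correlation_id maps to a single meta.root_caller_id.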

Normal jobs

  1. The easiest way is to run a pipeline. It will trigger some Sidekiq jobs.
  2. In log/sidekiq.log, you can then see meta.root_caller_id.
  3. Grep for the correlation_id. All Sidekiq jobs with that correlation_id should have the same meta.root_caller_id:
{"severity":"INFO","time":"2022-05-19T00:09:34.192Z","retry":3,"queue":"default","backtrace":true,"version":0,"queue_namespace":"pipeline_hooks","args":["1068"],"class":"BuildHooksWorker","jid":"152038142968abd249786d1a","created_at":"2022-05-19T00:09:32.225Z","correlation_id":"01G3CTBRV1QGF67GQHY85GZDJG","meta.user":"root","meta.project":"gitlab-org/gitlab-shell","meta.root_namespace":"gitlab-org","meta.client_id":"user/1","meta.caller_id":"Ci::InitialPipelineProcessWorker","meta.remote_ip":"127.0.0.1","meta.feature_category":"continuous_integration","meta.subscription_plan":"default","meta.root_caller_id":"Ci::InitialPipelineProcessWorker","worker_data_consistency":"delayed","wal_locations":{},"idempotency_key":"resque:gitlab:duplicate:default:8a9227dc7d726bc008e308898b4445ae2bcd24d2d2f70f86c429cb9657bab5e8","size_limiter":"validated","enqueued_at":"2022-05-19T00:09:32.225Z","job_size_bytes":6,"pid":86190,"message":"BuildHooksWorker JID-152038142968abd249786d1a: done: 1.965483 sec","job_status":"done","scheduling_latency_s":0.001313,"rugged_calls":1,"rugged_duration_s":0.001777,"redis_calls":1,"redis_duration_s":0.000363,"redis_read_bytes":10,"redis_write_bytes":312,"redis_queues_calls":1,"redis_queues_duration_s":0.000363,"redis_queues_read_bytes":10,"redis_queues_write_bytes":312,"db_count":45,"db_write_count":0,"db_cached_count":6,"db_replica_count":0,"db_primary_count":45,"db_main_count":39,"db_main_replica_count":0,"db_ci_count":6,"db_ci_replica_count":0,"db_replica_cached_count":0,"db_primary_cached_count":6,"db_main_cached_count":6,"db_main_replica_cached_count":0,"db_ci_cached_count":0,"db_ci_replica_cached_count":0,"db_replica_wal_count":0,"db_primary_wal_count":0,"db_main_wal_count":0,"db_main_replica_wal_count":0,"db_ci_wal_count":0,"db_ci_replica_wal_count":0,"db_replica_wal_cached_count":0,"db_primary_wal_cached_count":0,"db_main_wal_cached_count":0,"db_main_replica_wal_cached_count":0,"db_ci_wal_cached_count":0,"db_ci_replica_wal_cached_count":0,"db_replica_duration_s":0.0,"db_primary_duration_s":0.042,"db_main_duration_s":0.038,"db_main_replica_duration_s":0.0,"db_ci_duration_s":0.005,"db_ci_replica_duration_s":0.0,"cpu_s":0.029943,"rate_limiting_gates":[],"duration_s":1.965483,"completed_at":"2022-05-19T00:09:34.192Z","load_balancing_strategy":"primary_no_wal","db_duration_s":0.006113}
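With both kinds of entries in the log, the distinction this MR is after can be sketched as a small tally of jobs by origin. The sketch assumes one JSON object per line and treats any job whose meta.root_caller_id is not "Cronjob" as non-Cronjob-initiated. Because each job logs at least a start and a done line, jobs are de-duplicated by jid. The function name is illustrative only.

```python
import json
from collections import Counter
from typing import Iterable


def count_jobs_by_origin(lines: Iterable[str]) -> Counter:
    """Count distinct jobs (by jid) as 'cronjob' or 'other' based on meta.root_caller_id."""
    counts: Counter = Counter()
    seen_jids = set()
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # skip blank or non-JSON lines
        if not isinstance(entry, dict):
            continue
        jid = entry.get("jid")
        if not jid or jid in seen_jids:
            continue  # not a job line, or this job was already counted
        seen_jids.add(jid)
        if entry.get("meta.root_caller_id") == "Cronjob":
            counts["cronjob"] += 1
        else:
            counts["other"] += 1
    return counts
```

Run against the two sample lines in this description, the Ci::MergeRequests::AddTodoWhenBuildFailsWorker job would count as "cronjob" and the BuildHooksWorker job as "other".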

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Thong Kuah
