
Opt PipelineSuccessUnlockArtifactsWorker for db health based throttling

What does this MR do and why?

This MR opts Ci::PipelineSuccessUnlockArtifactsWorker in to the database health based throttling that was enabled in #404898 (closed).

For the first iteration, the worker uses the default defer time of 5 seconds.

Ref: #406255 (comment 1489373734)
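
As a rough illustration of the behaviour described above, here is a simplified, stand-alone model of health-based deferral. It is only a sketch: `Signal`, `ExampleWorker`, `run_or_defer`, and `DEFAULT_DEFER_DELAY` are hypothetical names invented for this example and do not mirror GitLab's internal classes or its worker DSL. The only details taken from this MR are the Stop signal and the 5 second default defer time.

```ruby
# Simplified, stand-alone model of health-based deferral. All names here are
# hypothetical and do not mirror GitLab's internal classes.

DEFAULT_DEFER_DELAY = 5 # seconds, the default defer time mentioned above

# A signal returned by a health indicator.
Signal = Struct.new(:kind, :reason) do
  def stop?
    kind == :stop
  end
end

# Hypothetical worker that opts in to deferral with the default delay.
class ExampleWorker
  def self.defer_delay
    DEFAULT_DEFER_DELAY
  end

  def perform(pipeline_id)
    "unlocked artifacts for pipeline #{pipeline_id}"
  end
end

# Hypothetical middleware step: run the job, or defer it when the indicator
# reports a Stop signal.
def run_or_defer(worker_class, args, indicator)
  signal = indicator.call
  if signal.stop?
    { status: :deferred, retry_in: worker_class.defer_delay, reason: signal.reason }
  else
    { status: :done, result: worker_class.new.perform(*args) }
  end
end

stopping = -> { Signal.new(:stop, "Test") }   # indicator forced to Stop
healthy  = -> { Signal.new(:normal, nil) }    # indicator reporting no problem

p run_or_defer(ExampleWorker, [1], stopping)
p run_or_defer(ExampleWorker, [1], healthy)
```

In this model a Stop signal never runs `perform`; the job is simply handed back with a retry delay, which is the effect the validation steps below look for in the logs.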

How to set up and validate locally

  1. Make a database health indicator return a Stop signal explicitly, for testing purposes.

    • Change HealthStatus::Indicators::AutovacuumActiveOnTable to return Signals::Stop.new(self.class, reason: "Test") from its evaluate method.
    • The Sidekiq middleware checks this indicator to decide whether to defer the job.
  2. Run gdk restart rails-background-jobs (if it has not been restarted since the change in step 1).

  3. Monitor the logs:

    • gdk tail rails-background-jobs
    • tail -f log/database_health_status.log
  4. Execute Ci::PipelineSuccessUnlockArtifactsWorker.perform_async(1) from the console.

  5. We can see the Stop signal being logged in database_health_status logs (source).

    {
       "severity":"INFO",
       "time":"2023-08-01T18:33:24.744Z",
       "correlation_id":"c8202eb62f77bfcc71a309c11f61451e",
       "status_checker_id":"ae52ad97a70c760e979f9fb6",
       "status_checker_type":"Gitlab::SidekiqMiddleware::SkipJobs::DatabaseHealthStatusChecker",
       "job_class_name":"Ci::PipelineSuccessUnlockArtifactsWorker",
       "health_status_indicator":"Gitlab::Database::HealthStatus::Indicators::AutovacuumActiveOnTable",
       "indicator_signal":"Stop",
       "signal_reason":"Test",
       "message":"#\u003cstruct Gitlab::SidekiqMiddleware::SkipJobs::DatabaseHealthStatusChecker id=\"ae52ad97a70c760e979f9fb6\", job_class_name=\"Ci::PipelineSuccessUnlockArtifactsWorker\"\u003e signaled: Stop (indicator: Gitlab::Database::HealthStatus::Indicators::AutovacuumActiveOnTable; reason: Test)"
    }
  6. The rails-background-jobs logs should contain one entry with job_status: 'start' and further entries with job_status: 'deferred'.

    // job_status: 'start'
    {
       "severity":"INFO",
       ...
       ...
       "class":"Ci::PipelineSuccessUnlockArtifactsWorker",
       "args":[
          "1"
       ],
       "jid":"ae52ad97a70c760e979f9fb6",
       "created_at":"2023-08-01T18:33:14.253Z",
       "meta.feature_category":"continuous_integration",
       "correlation_id":"c8202eb62f77bfcc71a309c11f61451e",
       "meta.caller_id":"Ci::PipelineSuccessUnlockArtifactsWorker",
       "meta.root_caller_id":"Ci::PipelineSuccessUnlockArtifactsWorker",
       "worker_data_consistency":"always",
       "size_limiter":"validated",
       "scheduled_at":"2023-08-01T18:33:19.253Z",
       "idempotency_key":"resque:gitlab:duplicate:default:dcf3672aa152e94c72ff8bfdc828ef88a3c94df8b67eb7f31fce4a97ac127d1e",
       "enqueued_at":"2023-08-01T18:33:24.735Z",
       "job_size_bytes":3,
       "pid":17906,
       "message":"Ci::PipelineSuccessUnlockArtifactsWorker JID-ae52ad97a70c760e979f9fb6: start",
       "job_status":"start",
       "scheduling_latency_s":0.000361,
       "enqueue_latency_s":5.481952
    }
    
    // job_status: 'deferred'
    
    {
      "severity":"INFO",
      "time":"2023-08-01T18:33:24.746Z",
      "retry":3,
      "queue":"default",
      "backtrace":true,
      "version":0,
      "queue_namespace":"pipeline_background",
      "class":"Ci::PipelineSuccessUnlockArtifactsWorker",
      "args":[
         "12"
      ],
      "jid":"e64f6962f859be6a9ab56806",
      "created_at":"2023-08-01T18:33:14.247Z",
      "meta.feature_category":"continuous_integration",
      "correlation_id":"831bbaad50610631f2fd3661c0d1332a",
      "meta.caller_id":"Ci::PipelineSuccessUnlockArtifactsWorker",
      "meta.root_caller_id":"Ci::PipelineSuccessUnlockArtifactsWorker",
      "worker_data_consistency":"always",
      "size_limiter":"validated",
      "scheduled_at":"2023-08-01T18:33:19.247Z",
      "idempotency_key":"resque:gitlab:duplicate:default:ab135d55d565c2789d0b20d6bb6d528916626cb47e9b43519c2a5a047ca86491",
      "enqueued_at":"2023-08-01T18:33:24.733Z",
      "job_size_bytes":4,
      "pid":17906,
      "message":"Ci::PipelineSuccessUnlockArtifactsWorker JID-e64f6962f859be6a9ab56806: deferred: 0.013101 sec",
      "job_status":"deferred",
      "scheduling_latency_s":0.000454,
      "enqueue_latency_s":5.486388,
      ...
      ...
      "completed_at":"2023-08-01T18:33:24.746Z",
      "load_balancing_strategy":"primary",
      "job_deferred_by":"database_health_check",
      "deferred_count":1,
      "db_duration_s":0.0,
      "urgency":"low",
      "target_duration_s":300,
      "target_scheduling_latency_s":60
    }
  7. Reverting the step 1 change should make the job run again without being deferred.
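
To check step 6 without reading the log by eye, the entries can be tallied by job_status. The following is a minimal sketch, assuming the log is available as one JSON object per line; `tally_job_statuses` is a hypothetical helper written for this example, and the sample lines are trimmed versions of the entries shown above.

```ruby
require "json"

# Count log entries per job_status for one worker class.
# Lines that are not valid JSON objects are skipped.
def tally_job_statuses(lines, worker_class)
  lines.each_with_object(Hash.new(0)) do |line, counts|
    entry = begin
      JSON.parse(line)
    rescue JSON::ParserError
      next
    end
    next unless entry.is_a?(Hash) && entry["class"] == worker_class

    counts[entry["job_status"]] += 1
  end
end

# Trimmed sample lines based on the log entries in steps 5 and 6.
sample = [
  '{"class":"Ci::PipelineSuccessUnlockArtifactsWorker","job_status":"start"}',
  '{"class":"Ci::PipelineSuccessUnlockArtifactsWorker","job_status":"deferred","job_deferred_by":"database_health_check"}',
  'not json'
]

counts = tally_job_statuses(sample, "Ci::PipelineSuccessUnlockArtifactsWorker")
puts counts
```

For a real run, pass `File.foreach(path)` as `lines`, where `path` points at wherever the background job log is written in your setup.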

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #414843 (closed)

Edited by Prabakaran Murugesan
