
pipeline stuck pending owing to PipelineProcessWorker called by Ci::PipelineBridgeStatusWorker being deduplicated - "dropped until executing"

Summary

A customer raised a ticket because a pipeline in their merge train got stuck in the pending state. Link for GitLab team members.

  • The parent pipeline contains only a trigger job for a child pipeline.
    • resource_group is set on the customer's trigger job
  • They re-ran it, and it was successful.

Kibana logs (looking for jobs with an argument of the pipeline ID)

The log entries start with jobs initiated by MergeTrains::RefreshWorker
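
The search described above can be approximated offline against exported log lines. The following is a minimal sketch, assuming each line is a JSON object with `time`, `class`, `args`, and `job_status` fields. Those field names, the sample records, and the pipeline ID `12345` are illustrative assumptions, not values taken from the actual logs.

```python
import json


def jobs_for_pipeline(lines, pipeline_id):
    """Return log records whose args include the pipeline ID, oldest first."""
    wanted = str(pipeline_id)
    matches = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not JSON
        if not isinstance(record, dict):
            continue
        # Compare as strings, since IDs may be logged as numbers or strings.
        args = [str(arg) for arg in (record.get("args") or [])]
        if wanted in args:
            matches.append(record)
    # ISO 8601 timestamps in the same timezone sort correctly as strings.
    return sorted(matches, key=lambda record: record.get("time", ""))


# Hypothetical sample data, for illustration only.
sample = [
    '{"time": "2022-01-01T10:00:05Z", "class": "PipelineProcessWorker", '
    '"args": ["12345"], "job_status": "deduplicated"}',
    '{"time": "2022-01-01T10:00:01Z", "class": "Ci::PipelineBridgeStatusWorker", '
    '"args": [12345], "job_status": "done"}',
    '{"time": "2022-01-01T10:00:03Z", "class": "PipelineProcessWorker", '
    '"args": ["67890"], "job_status": "done"}',
    "not a json line",
]

for record in jobs_for_pipeline(sample, 12345):
    print(record["time"], record["class"], record["job_status"])
```

Running the same filter for the stuck pipeline and for the successful re-run gives two ordered lists that can be compared side by side.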

Looking at them side-by-side

  • There's a delay (see [1] below), but otherwise the two runs look alike.
  • The same jobs got kicked off for both pipelines, up to a point.
  • Two PipelineProcessWorker jobs behave differently:
  1. There's a PipelineProcessWorker initiated by Ci::ResourceGroups::AssignResourceFromResourceGroupWorker in both pipelines, but in the stuck pipeline it takes about an hour for this to trigger. In the successful one, it runs within minutes or seconds of all the other jobs kicked off by MergeTrains::RefreshWorker.

    • Nothing in the logs indicates why; timings look correct, with no scheduling latency.
    • This doesn't affect the pipeline run time. On the contrary, end to end, the problematic pipeline runs faster (two hours instead of three), as measured from when MergeTrains::RefreshWorker kicks everything off to the second problematic PipelineProcessWorker ...
  2. When the PipelineProcessWorker is called by Ci::PipelineBridgeStatusWorker once the child pipeline has completed, Sidekiq deduplicates it in the pipeline that hangs. In the successful one, it executes, and a lot of other jobs also run as a result.

PipelineProcessWorker JID-1e88797bef20e60017d10e95: deduplicated: dropped until executing
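
To illustrate what this log line means, here is a toy model of "until executing" deduplication. It is a simplified sketch, not GitLab's implementation: the `ToyQueue` class, its in-memory lock set, and the pipeline ID `12345` are hypothetical. It assumes a lock keyed on worker class and arguments is taken when a job is enqueued and released just before that job starts executing, so an identical job enqueued in between is dropped.

```python
class ToyQueue:
    """Toy model of 'until executing' job deduplication (illustrative only)."""

    def __init__(self):
        self.locks = set()   # keys of jobs that are enqueued but not yet started
        self.pending = []    # jobs waiting to run, in FIFO order
        self.log = []

    def enqueue(self, worker, *args):
        key = (worker, args)
        if key in self.locks:
            # An identical job is already waiting, so this one is dropped.
            self.log.append(f"{worker} {args}: deduplicated: dropped until executing")
            return False
        self.locks.add(key)
        self.pending.append(key)
        return True

    def run_next(self):
        key = self.pending.pop(0)
        # The lock is released just before execution starts, so identical
        # jobs enqueued from this point on are accepted again.
        self.locks.discard(key)
        worker, args = key
        self.log.append(f"{worker} {args}: executed")


queue = ToyQueue()
first = queue.enqueue("PipelineProcessWorker", 12345)   # accepted, takes the lock
second = queue.enqueue("PipelineProcessWorker", 12345)  # dropped as a duplicate
queue.run_next()                                         # first job starts, lock released
third = queue.enqueue("PipelineProcessWorker", 12345)   # accepted again

for entry in queue.log:
    print(entry)
```

In this model the second request never runs on its own. The pipeline only advances if the job that held the lock runs after the child pipeline's status has been recorded. Whether that ordering is what went wrong here is not established by the logs above.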

Possibly related:

  • #342123 (closed)
    • AssignResourceFromResourceGroupWorker is involved in running some of the PipelineProcessWorker jobs
    • deduplication.type: dropped until executing
  • !71979 (merged) because the following elements came up in the investigation:
    • until_executed strategy
    • resource groups (resource_group is set on the customer's trigger job)
    • merge trains

Steps to reproduce

Example Project

What is the current bug behavior?

Some set of events causes the PipelineProcessWorker called by Ci::PipelineBridgeStatusWorker to be deduplicated.

What is the expected correct behavior?

In this situation, the PipelineProcessWorker called by Ci::PipelineBridgeStatusWorker executes.

Relevant logs and/or screenshots

Output of checks

This bug happens on GitLab.com

GitLab Enterprise Edition 15.6.0-pre a334075e

Possible fixes

One user experiencing this issue was able to successfully use the following workaround:

  • Rebase the problem branch
  • New pipeline is created
  • Merge request is able to be successfully merged
Edited by Michael Lussier