Configurable maximum number of pipelines in a merge train
Problem
Currently a merge train has a fixed limit of 20 pipelines [docs]:
Each merge train can run a maximum of twenty pipelines in parallel. If more than twenty merge requests are added to the merge train, the merge requests are queued until a slot in the merge train is free. There is no limit to the number of merge requests that can be queued.
For projects with large pipelines and many stages, this can cause serious contention for runner resources, and is particularly bad when jobs are not interruptible. In these cases merge trains are effectively unusable unless you have the CI runner capacity to execute 20 pipelines in parallel.
As an example: our organization's main project, with around 35 active developers, has a pipeline of 90 jobs that takes approximately 40 minutes to complete. With our runner pool we can generally run 3-4 pipelines in parallel before the number of pending jobs starts increasing significantly.
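To make the mismatch concrete, a back-of-the-envelope sketch using only the figures from the example above and the 20-pipeline limit from the docs (nothing here is a GitLab API):

```ruby
# Rough illustration of the contention: a full merge train can enqueue
# far more jobs than the runner pool described above can absorb.
jobs_per_pipeline = 90        # jobs in our main project's pipeline
train_limit = 20              # fixed merge train pipeline limit
sustainable_parallelism = 4   # pipelines our runner pool can run comfortably

jobs_a_full_train_can_enqueue = jobs_per_pipeline * train_limit              # 1800 jobs
jobs_the_pool_can_absorb = jobs_per_pipeline * sustainable_parallelism       # 360 jobs
```

A full train can put roughly five times more work in flight than the pool can sustain, which is why a configurable (lower) limit matters.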
Before merge trains were released, we had built a bot to queue MRs and automatically rebase and merge them as required. It runs a single pipeline at a time to ensure CI is not saturated. However, we would much rather use the built-in merge trains feature, as it is better integrated into the merge request flow and would reduce our maintenance burden.
In order for us to be able to use merge trains, we'd like to be able to limit the number of parallel pipelines that can be run on a project level. It should be possible to set this as low as 1 (meaning no parallelism) such that each MR in the merge train is merged in sequence.
Proposal
This proposal is to make the maximum number of pipelines that can run in parallel in a merge train configurable.
Iteration 1:
- Introduce a plan limit, which would allow:
  - Self-managed instances to set the limit application-wide as they see fit
  - Us to adjust the application-wide limit more easily via a change request on .com (we should be careful about doing this without a project-level limit also in place, because it can increase compute usage on runners for customers)
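A hypothetical sketch of how Iteration 1's lookup would resolve: a per-plan value overrides the application-wide default of 20. The hash here stands in for the `plan_limits` record; the names are illustrative, not the final GitLab API.

```ruby
# Column default from the proposed migration; used when no per-plan
# override is present. (Illustrative sketch, not GitLab code.)
DEFAULT_MAX_PIPELINES_PER_MERGE_TRAIN = 20

def max_pipelines_per_merge_train(plan_limits)
  plan_limits.fetch(:max_pipelines_per_merge_train, DEFAULT_MAX_PIPELINES_PER_MERGE_TRAIN)
end

max_pipelines_per_merge_train({})                                 # => 20 (default)
max_pipelines_per_merge_train(max_pipelines_per_merge_train: 5)   # => 5 (instance override)
```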
Implementation guide
Database change (should be in a migration):

```ruby
add_column(:plan_limits, :max_pipelines_per_merge_train, :integer, default: 20, null: false)
```
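Wrapped in a migration class, this would look roughly like the following. The `Gitlab::Database::Migration[2.1]` base class and the class name are assumptions; check the current migration style guide (and the example MR below) before copying.

```ruby
# Sketch only: base class version and naming are placeholders.
class AddMaxPipelinesPerMergeTrainToPlanLimits < Gitlab::Database::Migration[2.1]
  def change
    add_column :plan_limits, :max_pipelines_per_merge_train, :integer, default: 20, null: false
  end
end
```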
Change the RefreshService to use the configurable limit:
```diff
diff --git a/ee/app/services/merge_trains/refresh_service.rb b/ee/app/services/merge_trains/refresh_service.rb
index d4da9ef8a808..5da5238baa11 100644
--- a/ee/app/services/merge_trains/refresh_service.rb
+++ b/ee/app/services/merge_trains/refresh_service.rb
@@ -9,7 +9,7 @@ module MergeTrains
   # NOTE: To prevent concurrent refreshes, `MergeTrains::RefreshWorker` implements a locking mechanism through the
   # `deduplicate :until_executed, if_deduplicated: :reschedule_once` option within the worker
   class RefreshService
-    DEFAULT_CONCURRENCY = 20
+    include Gitlab::Utils::StrongMemoize

     def initialize(target_project_id, target_branch)
       @target_project_id = target_project_id
@@ -21,13 +21,15 @@ def execute
       train = MergeTrains::Train.new(@target_project_id, @target_branch)

-      train.all_cars(limit: DEFAULT_CONCURRENCY).each do |car|
+      return unless project
+
+      concurrency = project.actual_limits.max_pipelines_per_merge_train
+
+      train.all_cars(limit: concurrency).each do |car|
         result = MergeTrains::RefreshMergeRequestService
-          .new(car.target_project, car.user, require_recreate: require_next_recreate)
+          .new(project, car.user, require_recreate: require_next_recreate)
           .execute(car.merge_request)

         require_next_recreate = (result[:status] == :error || result[:pipeline_created])
       end
     end
+
+    private
+
+    def project
+      strong_memoize(:project) do
+        Project.find_by_id(@target_project_id)
+      end
+    end
   end
 end
```
We also need to add documentation and tests. Here is an old MR to follow as an example of a change to plan limits: !48955 (diffs). Some files may have been moved or changed since that MR.
Iteration 2:
- Introduce a project-level limit (the plan limit always 'wins' over the project limit)
- Why?:
Another reason to have the limit is to avoid wasting customer resources (compute minutes, self-managed runner capacity): once a train grows to 20 MRs, it is very likely that something will fail within one of them, causing the entire train to refresh:
If a merge train pipeline fails, the merge request is not merged. GitLab removes that merge request from the merge train, and starts new pipelines for all the merge requests that were queued after it.
Customers may want different values depending on the stability of pipelines within the project (for instance, how often pipelines are affected by a 'broken master'). It's something we should eventually allow customers to set, as long as we have a 'protect the system' limit in place for .com.
Also, from the customer-created problem description:
In order for us to be able to use merge trains, we'd like to be able to limit the number of parallel pipelines that can be run on a project level. It should be possible to set this as low as 1 (meaning no parallelism) such that each MR in the merge train is merged in sequence.
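The precedence described above (the plan limit always 'wins') can be sketched as a simple minimum over the two settings. The method and parameter names here are illustrative, not the eventual GitLab API:

```ruby
# Hypothetical Iteration 2 resolution: a project-level limit can lower the
# concurrency (down to 1 for fully sequential merging), but can never
# exceed the plan-level 'protect the system' limit.
def effective_merge_train_concurrency(plan_limit:, project_limit: nil)
  return plan_limit if project_limit.nil?

  [project_limit, plan_limit].min
end

effective_merge_train_concurrency(plan_limit: 20)                    # => 20
effective_merge_train_concurrency(plan_limit: 20, project_limit: 1)  # => 1 (sequential)
effective_merge_train_concurrency(plan_limit: 20, project_limit: 50) # => 20 (plan wins)
```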