https://gitlab.com/gitlab-org/gitlab-ee/issues/9186 introduces the concept of merge trains, but for the MVC we are only running them sequentially. To really reap the benefits of merge trains, we can optimistically build refs and run the pipelines in parallel, resulting in very fast merge train execution for scenarios where most pipelines are likely to succeed.
Intended users
All Development Teams
Proposal
Each MR that joins a merge train joins as the last item in the train, just as it works in the current state. However, instead of queuing and waiting, each item takes the completed state of the previous (pending) merge ref, adds its own changes, and starts the pipeline immediately in parallel under the assumption that everything is going to pass. In this way, if all the pipelines in the train merge successfully, no pipeline time is wasted either queuing or retrying. If the button is subsequently pressed in a different MR, instead of creating a new pipeline for the target branch, it creates a new pipeline targeting the merge result of the previous MR plus the target branch. Pipelines invalidated through failures are immediately canceled and requeued.
With this iteration, we also remove the problem with merge trains being one-at-a-time, so we can remove the option to enable merge trains and make this strategy the default. The user should be able to select "one at a time" as the strategy if they want, but this should not be the default. The reason we keep the strategy as a choice is to support use cases such as ensuring changes are deployed to production one at a time.
For example, if four MRs are queued together, the refs their pipelines build (in parallel) would look like this:
MR1
MR1+MR2
MR1+MR2+MR3
MR1+MR2+MR3+MR4
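A toy sketch (hypothetical, not GitLab's actual implementation) of how each item's pipeline content is composed from everything ahead of it in the train:

```python
def train_refs(mrs):
    """Each MR's pipeline runs against the combined changes of every
    earlier train item plus its own (a cumulative composition)."""
    return ["+".join(mrs[: i + 1]) for i in range(len(mrs))]

# Four MRs queued together start four pipelines in parallel:
print(train_refs(["MR1", "MR2", "MR3", "MR4"]))
# → ['MR1', 'MR1+MR2', 'MR1+MR2+MR3', 'MR1+MR2+MR3+MR4']
```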
It's important to note that, given this composition, the MRs must never be allowed to merge out of order, even if a later item's pipeline somehow finishes first. Also, if any item fails, the train will need to be recalculated and restarted with the first non-failing item at the head of the train. For example, if MR2 (pipeline 2) fails (or is removed or canceled), all running pipelines for the merge train will be canceled as invalid, and new ones built containing:
MR3
MR3+MR4
In this scenario, MR1 will have already merged, so it is no longer in play. MR2 is known to be broken, so it is not added back to the merge train. MR2 could potentially resolve its issues and queue back up at the end of the train, making it MR3+MR4+MR2.
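The recalculation above can be sketched as follows (a simplified model; `rebuild_train` and `failed_index` are illustrative names, not a real API):

```python
def rebuild_train(train, failed_index):
    """Drop the failed item; everything behind it is requeued with
    recomputed refs. Items ahead of the failure are unaffected (they
    either merged already or keep their running pipelines)."""
    requeued = train[failed_index + 1:]
    return ["+".join(requeued[: i + 1]) for i in range(len(requeued))]

# MR2 (index 1) fails: MR1 has merged, MR2 drops, the train restarts as:
print(rebuild_train(["MR1", "MR2", "MR3", "MR4"], failed_index=1))
# → ['MR3', 'MR3+MR4']
```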
If the target branch is updated by someone directly, bypassing the merge train, all merge train pipelines are immediately cancelled and restarted using recalculated refs. People should not be committing directly to the target branch if they are using merge trains, since it invalidates the whole thing. We won't block this in the case of emergencies, but it is definitely an exception and not a normal use case.
Permissions and Security
We don't limit the maximum length of a merge train, although we will limit the maximum parallelization. For the MVC we set the parallelization factor to 4. In future iterations this can potentially be tuned or made configurable, though our desire is to limit configuration options as much as possible.
Documentation
We will need to update the merge trains documentation to describe how this new strategy works. We are also changing the default strategy from one-at-a-time, so users of that strategy will need to know how to change back.
Testing
Particular testing focus should be on ensuring that out-of-order merges cannot happen.
What does success look like, and how can we measure that?
We should measure # of pipelines that are part of a merge train.
What is the type of buyer?
This feature is used by all developers.
Links / references
Slack: #f_merge_trains
Product Design
The pipeline messages for merge trains should have their copy modified as follows:
Example UI
Copy
Pipeline #000 failed for merge train into [target-branch]
Pipeline #000 passed for merge train into [target-branch]
Pipeline #000 passed with warnings for merge train into [target-branch]
Pipeline #000 skipped for merge train into [target-branch]
Other Potential Strategies
Generate All Outcomes
Another variation of the Hybrid approach above would be to kick off pipelines for every possible combination of outcome scenarios in the merge train. If there are 3 pipelines in the train, the 4th item to join will start the following pipelines:
1+2+3+4 (in case all succeed)
1+3+4 (in case 2 fails)
1+4 (in case 2 and 3 fail)
4 (in case 1, 2, and 3 fail)
.. and so on.
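As a sketch, the number of pipelines this strategy starts grows as the powerset of the earlier train items (2^n for n items ahead of the new one), which is why a depth limit would be needed. This is a hypothetical helper for illustration, not a proposed implementation:

```python
from itertools import combinations

def outcome_refs(prior, new):
    """One pipeline per possible success subset of the prior train
    items; the newly queued MR is included in every combination."""
    refs = []
    for k in range(len(prior), -1, -1):  # from "all succeed" down to "all fail"
        for subset in combinations(prior, k):
            refs.append("+".join(list(subset) + [new]))
    return refs

print(outcome_refs(["1", "2", "3"], "4"))
# 8 pipelines: '1+2+3+4', '1+2+4', '1+3+4', '2+3+4', '1+4', '2+4', '3+4', '4'
```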
You'd probably need a depth limit at some point, but this would guarantee a timely result for whatever scenario occurs. For scenarios where compute is cheaper than time lost to a broken target branch, this is the optimal approach. It sounds unusual, but we've had customers for whom time is worth much, much more than money (and who have significant resources available), and for whom this is the preferred strategy.
Uber-style dependency analysis
Implement analysis of relationships in code to determine pipeline order. This is an advanced technique and would require significant understanding of the code checked in to the repo. Details on Uber's implementation available at https://eng.uber.com/research/keeping-master-green-at-scale/
From my understanding, not much would change in the UI/UX for the parallel approach -- unless there is a change/decision made regarding the max. number of merge requests a train can support. In this case, we should be able to add additional interaction or information for the users.
Also, if this is configurable we will touch the settings page. @jlenny
There will have to be a max parallel. We don't have to make it configurable (for now) but we should pick a smart default and get feedback from there. Users will need to be able to choose between this and the (pre-existing) sequential strategy for merge trains.
From an engineering point of view, what are the constraints for multiple parallel merge trains? I would like to understand the potential impact when we have N merges running.
Users will need to be able to choose between this and the (pre-existing) sequential strategy for merge trains.
So, in any case, on top of the current settings (which merge strategy type to use) users would have to choose between Parallel and Sequential merge train options?
If you queue and start 50 MRs, and all 50 are running when the second one fails, 49 will need to be canceled and restarted. There's a risk (of using too much compute) vs. reward (if everything succeeds, it's very fast.)
I've added an initial line in the sand for parallelization and length:
We need to control what the maximum merge train length is, and what the maximum parallelization is. For the MVC we will cap the maximum length of the merge train to 10 and a parallelization factor of 4. In future iterations these can be tuned or made configurable.
I'm open to disagreement about what the exact numbers are.
@jlenny Just to confirm, does the parallelization factor mean a merge train can create up to four concurrent pipelines at once? So does this mean that if the first MR is merged, the 5th MR's pipeline will be automatically triggered?
Yeah, so if there are 10 items in the merge train and parallelization is enabled, it won't start all 10 at once. It will only let a maximum of 4 at a time run, and when any of those finish the next waiting will be started (they still need to merge in the right order, of course.) This will prevent someone from accidentally (or maybe intentionally) creating dozens or hundreds of concurrent pipelines.
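In other words, the set of running pipelines is a sliding window over the train. A minimal sketch (illustrative names, assuming the factor-of-4 default discussed in this thread):

```python
def running_pipelines(train, merged_count, factor=4):
    """Skip items that already merged, then allow at most `factor`
    of the remaining train items to have running pipelines."""
    return train[merged_count : merged_count + factor]

train = [f"MR{i}" for i in range(1, 11)]         # ten items queued
print(running_pipelines(train, merged_count=0))
# → ['MR1', 'MR2', 'MR3', 'MR4']
print(running_pipelines(train, merged_count=1))  # MR1 merged: MR5 starts
# → ['MR2', 'MR3', 'MR4', 'MR5']
```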
This functionality is great to have, IMO. The engineering side also expected this kind of optimization, so we designed our backend architecture to be compatible with it.
Running pipelines based on probability sounds nice. Going further, this architecture could incorporate some kind of deep-learning approach: we could teach the queuing system by feeding it related data, such as pipeline status, duration, changed lines/files, etc.
However, our situation seems quite different from what the statement describes, as we already have the following:
Pipelines for merged refs always run on a merge request, and users cannot enqueue a merge request unless its pipeline succeeds. This means that when a user enqueues a change onto a sequential queue, the change has already been verified not to break the target branch's pipeline. The only way the master branch can break is through the other merge requests in the same queue.
When a user enqueues a merge request, the system checks for merge conflicts with the previously enqueued merge requests. This means we already have the so-called Conflict Analyzer from the statement.
We strive to reduce the cost of CI. The probability-based approach might speed up the total merge time, but it requires additional pipelines, so we would spend more on computation.
@jlenny - should we note that in this iteration we feel comfortable removing the second "merge train" option and having this become the standard behavior when you want to validate the post-merge results?
I imagine there's a UI component to viewing these MR changes? We probably need a parallel view of them if running in parallel mode.
Kras: Backend L | Frontend S
Shinya: Backend L | Frontend L. We need to introduce an additional ref for this parallelism. This would be one of the biggest challenges.
Jason G: Backend L | Frontend L. Sounds cool to work on and incredibly complex. I would want to give this a size estimate of more than "Large".
Filipa: Backend L | Frontend L
Nathan: Backend L | Frontend S :smiley:
This issue is primarily a backend task, but there may be frontend changes that come up as a result of backend refactoring.
This is a carry-over from an issue in the last milestone. So far it has involved a lot of refactoring, so it's possible we may uncover more things that need refactoring going forward.
We will need to update the merge trains documentation to describe how this new strategy works. We are also changing the default strategy from one-at-a-time, so users of that strategy will need to know how to change back.
@jlenny Are we adding a new project-level option to let users go back to the one-at-a-time strategy? I hope not.
It's likely that we'll put this feature behind a feature flag as a safeguard on .com, so the flag could be used for this purpose, IMO.
Why do you hope not? The one-at-a-time strategy is still valuable for people who want to ensure changes are deployed to production one at a time, for example. There will also potentially be other strategies to choose from in the future.
The one-at-a-time strategy is still valuable for people who want to ensure changes are deployed to production one at a time, for example.
At the moment, pipelines that run on merge trains are still in the context of a merge request. This means those pipelines cannot take the place of the target branch's actual pipelines, which could behave slightly differently (e.g. deployment jobs that have `only: master` keywords in gitlab-ci.yml do not run on merge trains).
Regardless of the one-at-a-time strategy or concurrent pipeline runs, each merge request will be merged sequentially into the target branch, and each merge creates an additional pipeline on the target branch.
@ayufan Regarding train ref creation, you mentioned that we still use refs/merge; however, I still think we don't need it and can just use refs/train for everything. Here is what makes the most sense to me:
MR1 pipeline - ref: refs/mr1/train (consists of source sha and target sha)
MR2 pipeline - ref: refs/mr2/train (consists of source sha and refs/mr1/train)
MR3 pipeline - ref: refs/mr3/train (consists of source sha and refs/mr2/train)
Would you mind explaining your concerns on the above approach?
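A sketch of the chained-ref idea above (the refs/mrN/train names come from the comment; the tuple contents are only a conceptual description of what each ref merges, not real Git plumbing):

```python
def train_ref_chain(mrs, target="master"):
    """Each train ref merges the MR's source onto the previous train
    ref; the first item merges onto the target branch directly."""
    chain, base = [], target
    for i, mr in enumerate(mrs, start=1):
        ref = f"refs/mr{i}/train"
        chain.append((ref, f"{mr} source + {base}"))
        base = ref  # the next item builds on this ref
    return chain

for ref, parents in train_ref_chain(["MR1", "MR2", "MR3"]):
    print(f"{ref}: {parents}")
# refs/mr1/train: MR1 source + master
# refs/mr2/train: MR2 source + refs/mr1/train
# refs/mr3/train: MR3 source + refs/mr2/train
```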
@dosuken123 Just to confirm your ask to @rverissimo
and create a mock for this? Thanks
Did you mean the original scope mock-up for the issue?
For the MVC we will cap the maximum length of the merge train to 10 and a parallelization factor of 4. In future iterations these can potentially be tuned or made configurable, though our desire is to limit configuration options as much as possible.
I think we could consider having application_setting defining maximum parallel limit for merge train pipelines, and have a per-project setting to define your own. (edited)
I think these are two separate requests with different scope. Your thoughts?
@cstasik There are two requirements in this scope. 1. Parallelization 2. User can still choose one-at-a-time strategy (i.e. the one we implemented in the previous iteration).
Now, if we implement the following specification, users will clearly lose the ability to go back to the previous behavior, which is item 2 in the list above. So we need some kind of control to switch back to the old strategy.
For the MVC we will cap the maximum length of the merge train to 10 and a parallelization factor of 4. In future iterations these can potentially be tuned or made configurable, though our desire is to limit configuration options as much as possible.
The best proposal to cover these requirements is adding a project-level (and application-level, if possible) config to let users change the parallelization factor. We set a default of 4 for all projects; however, if a user wants to switch back to the old strategy, they can set it to 1. Sooner or later we will need to add this parameter, so it's better to use this value for strategy handling today.
@dosuken123 Reading through the Issue Scope one more time, I believe we included both asks you have listed.
The user should be able to select "one at a time" as the strategy if they want but this should not be the default.
And
For the MVC we will cap the maximum length of the merge train to 10 and a parallelization factor of 4. In future iterations these can potentially be tuned or made configurable, though our desire is to limit configuration options as much as possible.
Are we aligned? It's important that in this iteration we keep changes to a minimum: basic function this release, no robust admin view, and no complex logic brought in as an afterthought for improvement.
@rverissimo UX designs should reflect the original scope. Can you confirm the designs support both requirements? Also, aren't designs usually posted in the issue? If they are aligned, then we should resolve this and go back to UX Ready status with the designs posted.
It's important that in this iteration we keep changes to a minimum: basic function this release, no robust admin view, and no complex logic brought in as an afterthought for improvement.
@cstasik This is something we engineers always keep in mind: it's an MVC, so we don't want to scope in excessive functionality. The proposal we made is the minimal functionality for accomplishing the MVC.
If we don't want to scope in leveraging the parallelization factor here, please remove the former requirement, which is:
The user should be able to select "one at a time" as the strategy if they want but this should not be the default.
@mnichols1 @dosuken123 @rverissimo
Do you feel we are aligned on the solution intent and user experience? I have seen re-clarification needed a few times when explaining this, including this conversation, which suggests we might not have clarity or that the issue should be revised to make it clear. https://gitlab.slack.com/archives/CHLKE258E/p1560942231009900?thread_ts=1560939868.001800&cid=CHLKE258E There is no harm in regrouping on this complex feature to ensure we are all confident that the issue and the delivered explanation are clear.
@dosuken123 I believe our success metrics need refinement, so it would be great to get your technical point of view on what might need to be tested when introducing this new functionality. @rverissimo @mnichols1 What are the typical validations for user experience from your side?
We should measure # of pipelines that are part of a merge train.
Ideally this would be driven off a user-experience flow/journey baseline, but I know that is still ramping up.
For the MVC we will cap the maximum length of the merge train to 10
@rverissimo I think we need to show a warning when merge requests cannot be added to a train due to the upper limit. Users should not see the "Start/Add to train" or "Start/Add to train when pipeline succeeds" buttons in this case, and should instead see text like "The merge train is currently at full capacity. Please come back later to add your merge request."
We'd still need to show the "Immediate merge" button to unblock urgent patches; however, the above two buttons should disappear, IMO.
@rverissimo @jhampton If possible, let's defer adding design scope and iterate in a follow up release.
@dosuken123 This is a great observation, and something we could have identified if we had started with experience mapping for the issue, so that's a learning we will include as a task when we open new issues. To cap scope and focus on completing this for the release, I would prefer we solve this in the simplest manner and place the warning information in documentation for %12.1, with a follow-up issue for %12.2 to execute the UI polish and adjustment.
I think we need to show a warning when merge requests cannot be added to a train due to the upper limit. Users should not see the "Start/Add to train" or "Start/Add to train when pipeline succeeds" buttons in this case, and should instead see text like "The merge train is currently at full capacity. Please come back later to add your merge request."
We'd still need to show the "Immediate merge" button to unblock urgent patches; however, the above two buttons should disappear, IMO.
I'm open to other suggestions besides documentation for keeping the scope as planned, but the overall objective is to complete the committed scope and iterate afterwards.
If so, would you mind explicitly defining the state of the merge buttons in merge requests when a merge train has reached the upper limit? Do users see only the "Merge Immediately" button, regardless of whether the project has the "Allow Merge Trains" option enabled?
I would prefer we solve this in the simplest manner and place the warning information in documentation for %12.1, with a follow-up issue for %12.2 to execute the UI polish and adjustment.
Catching up on the discussion and bringing some UX insights below. Please let me know if I am following it correctly.
From current SSOT for this issue:
We need to control what the maximum merge train length is, and what the maximum parallelization is. For the MVC we will cap the maximum length of the merge train to 10 and a parallelization factor of 4. In future iterations these can potentially be tuned or made configurable, though our desire is to limit configuration options as much as possible.
Parallel vs. One at a time
For this MVC, we don't let users control parallelization factor via project settings. In the future we can include the ability to select which merge train strategy is in place, as well as setting the limit for the length of a train.
I believe as part of this MVC we should also add a helper text to the settings page, informing of the default value for parallelization of a merge train, as well as a link to the merge train docs. See proposed wireframe below:
Parallelization limit on MR widget
Once a merge train reaches the parallelization limit, the system should present the correct information in the UI. The use case identified suggests that users should not be able to add an MR to a merge train once the limit is reached, but should still be able to perform an immediate merge.
@dosuken123 questions:
Does the same happen for Start/add to merge train when pipeline succeeds?
Is there any pattern/behaviour that affects the existing merge train when users merge immediately?
@rverissimo Just wanted to confirm: "Merge immediately" shows when the pipeline for the merge request (or the branch pipeline, if disabled) goes green but the MR isn't part of a train yet, right? In other words, there is still no way to merge an MR without at least some pipeline going green for it (train'ed or not).
@redbaron1 Yes, at this point the merge request is not part of the merge train yet. AFAIK the user can merge immediately without the pipeline being green (start merge train when pipeline succeeds) https://gitlab.com/gitlab-org/gitlab-ee/issues/9186#ux-proposal. But it's a good question... @dosuken123 would you mind jumping in? Thanks
@rverissimo This is completely different from my understanding.
For this MVC, we don't let users control parallelization factor via project settings. In the future we can include the ability to select which merge train strategy is in place, as well as setting the limit for the length of a train.
We will remove the "Allow merge trains" option as the issue description states
With this iteration, we also remove the problem with merge trains being one-at-a-time, therefore we can remove the option to enable merge trains and have this strategy be the default.
With that, how can users switch the strategy between Parallel and One at a time?
Once a merge train reaches the parallelization limit, the system should inform by presenting the correct information in the UI. Use case identified suggests that users should not have the ability to add a MR to a merge train once limit is reached, but it's still able to perform an immediate merge.
This sounds weird. Users use merge trains to avoid merging immediately (which is the most primitive way, shouldn't be the first choice, and is why the button has a red warning color in some cases); yet here we're prompting users to choose the merge-immediately strategy once the merge train has reached its upper limit.
I think users should see a disabled "Add to train" button (with a tooltip saying "Merge train reached the upper limit. Please come back again later" or something) and can still choose the "Merge Immediately" button, but it should be hidden under the dropdown.
Does the same happen for Start/add to merge train when pipeline succeeds?
Yes, same.
Is there any pattern/behaviour that affects the existing merge train when users merge immediately?
Consider what happens when a merge request is merged immediately while the merge train is full. Since the target branch has advanced, all merge requests will regenerate their train refs and re-run their pipelines, so the merge train will take longer to finish. In the worst case, if users keep merging immediately, the merge train will never finish.
and can still choose "Merge Immediately" button but should be hidden under the dropdown.
What are the prerequisites for "Merge immediately" to be active at all? Does a normal pipeline for the merge request still have to run for "Merge immediately" to be active?
@dosuken123 Agreed that it seems strange to fall back to "Merge immediately" when the merge train reaches the maximum length. The primary purpose of this feature is to ensure stability in the target branch, and "Merge immediately" runs contrary to this goal in my opinion.
Is there a reason (either technical or UX-related) we need to enforce an upper limit on merge train length? In my opinion, the simplest and best solution is to not limit merge train length at all.
Is there a reason (either technical or UX-related) we need to enforce an upper limit on merge train length?
From the backend perspective, this upper limit is quite important. A long train could re-create a large number of pipelines when the target branch advances, and this could affect CI costs significantly (e.g. with 100 merge requests on a train, 99 pipelines could be recreated at once). Also, this is not a cheap operation, so it could affect our production load as well.
We'd like to ensure that users are using merge trains of reasonable length. I even think this limit should have existed in the previous iteration.
@dosuken123 Ah, I hadn't considered CI costs. But isn't this where the parallelization limit will kick in? So even if 1000 MRs were added to a merge train, only the first 4 in the queue will actually have merge train pipelines created? (Assuming a parallelization limit of 4.) I might be misunderstanding this.
In my opinion, the best user experience would be to allow users to add an infinite number of merge requests to the merge train, and the system can take care of optimizing the process to avoid excess CI costs.
But isn't this where the parallelization limit will kick in? So even if 1000 MRs were added to a merge train, only the first 4 in the queue will actually have merge train pipelines created?
This is a good point. As long as we limit the parallelization factor, the worst case can be avoided. In that case, we can descope the upper limit from this issue and focus on the parallelization factor only.
Ugh, I am glad we're having a call to talk about this. I took some comments into account when prototyping.
We will remove the "Allow merge trains" option as the issue description states
Gotcha! @dosuken123
I think users should see a disabled "Add to train" button (with a tooltip saying "Merge Train reached upper-limit. Please come back again later" or something) and can still choose "Merge Immediately" button but should be hidden under the dropdown.
In this case we use the dropdown button. Perhaps instead of having two different buttons here (one disabled, and another one to merge), we display the merge option as the first one in the dropdown, as well as on the button label. UI-wise, it would be weird to have a disabled action on the button while still allowing users to click on it to perform other actions.
With this iteration, we also remove the problem with merge trains being one-at-a-time, therefore we can remove the option to enable merge trains and have this strategy be the default. The user should be able to select "one at a time" as the strategy if they want but this should not be the default.
If I understand this correctly:
Merge trains will be active by default
Parallel merge trains would always be active by default
Users should be given the option to choose between strategies: Parallel vs. One at a time
Agreed that it seems strange to fall back to "Merge immediately" when the merge train reaches the maximum length. The primary purpose of this feature is to ensure stability in the target branch, and "Merge immediately" runs contrary to this goal in my opinion.
I think users should see a disabled "Add to train" button (with a tooltip saying "Merge Train reached upper-limit. Please come back again later" or something) and can still choose "Merge Immediately" button but should be hidden under the dropdown.
@nfriend @dosuken123 I understand the worry here, but from a UX point of view it is odd to present the user with a disabled first option or to remove it completely. I also want to avoid using yet another tooltip, to minimize the cognitive load in our interface.
Gray out any unavailable options instead of removing them: any items that cannot be selected should remain in view. For extra UX credit, consider showing a short balloon help message if users hover over a grayed-out option for more than a second, explaining why that option is disabled and how to make it active.
If disabled items are removed, the interface loses spatial consistency and becomes harder to learn.
If I understand your suggestions correctly, which I would like to avoid, this is what the UI would look like:
My suggestion, as described before, is to let users select the first available option in the dropdown without having to deal with a different context or interaction for how to merge. Users would see a dropdown menu, just like before, but instead of having to hover over it to figure out why it's disabled, we move the "Merge immediately" option to be the one users see on the button label. We add helper text indicating that the merge trains strategy is active but the merge train is at full capacity. Users can still merge changes immediately or wait until a position opens in the merge train. We also add a link to the docs for more information.
This message could be an info/warning alert, but here I suggest a simple helper text.
Would love to get also ux feedback on this @vkarnes@mnichols1. Have you worked with similar patterns before? Do you see this interaction being handled differently?
This is good point. As long as we limit the parallelization factor, the worst case can be avoided. In that case, we can descope the upper limit from this issue and focus on the parallelization factor only.
In other words, we won't enforce a limit on the length of merge trains and will always show the "Add to merge train" option.
We need to control what the maximum merge train length is, and what the maximum parallelization is. For the MVC we will cap the maximum length of the merge train to 10 and a parallelization factor of 4. In future iterations these can potentially be tuned or made configurable, though our desire is to limit configuration options as much as possible.
@dosuken123 I understand it will be changed to something like:
We don't control the maximum merge train length or the maximum parallelization. For the MVC we won't cap the maximum length of the merge train or the parallelization factor. In future iterations these can potentially be tuned or made configurable, though our desire is to limit configuration options as much as possible.
I will update the description, please feel free to correct if necessary.
@rverissimo I think we will still need to cap the parallelization factor:
and focus on the parallelization factor only.
So maybe something like this?
We don't limit the maximum length of a merge train, although we will limit the maximum parallelization. For the MVC we set the parallelization factor to 4. In future iterations this can potentially be tuned or made configurable, though our desire is to limit configuration options as much as possible.
@nfriend @rverissimo Can we create a separate issue for the maximum merge train length? It doesn't seem urgent right now, but it might come up in the future. So, just in case.
Users should be given the option to choose between strategies: Parallel vs. One at a time
I originally suggested implementing a radio button to select the pipeline execution strategy; however, I believe setting the parallelization factor via a numeric text box is more extensible than a radio button.
This is because users can switch back to the so-called one-at-a-time strategy by simply setting the parallelization factor to 1. To illustrate:
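A sketch of that numeric setting (a hypothetical helper; the 1-10 bounds and default of 4 follow the numbers discussed in this thread):

```python
def normalize_factor(value, default=4, minimum=1, maximum=10):
    """Clamp a user-supplied parallelization factor to the allowed
    range; a factor of 1 reproduces the one-at-a-time strategy."""
    if value is None:
        return default
    return max(minimum, min(maximum, int(value)))

print(normalize_factor(None))  # → 4  (default)
print(normalize_factor(1))     # → 1  (one-at-a-time)
print(normalize_factor(50))    # → 10 (clamped to the maximum)
```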
I originally suggested implementing a radio button to select the pipeline execution strategy; however, I believe setting the parallelization factor via a numeric text box is more extensible than a radio button.
@dosuken123 I agree here. If we want to allow users to set a number for the parallelization in the future, a radio button by itself won't solve the problem.
@cstasik we will discuss this later today, but we can already open a new issue if this is not part of the scope of this task. Can you have a look, please?
Before introducing merge trains, people created MRs targeting the master branch. Each MR had no relationship to the other MRs. This caused a lot of conflicts and broken pipelines when merging things into master. That is why we implemented merge trains.
With merge trains, people can queue merge requests, chaining and sequentially merging them. In a merge train, everything is lined up in the correct order. After MR1 enters master and master is green, MR2 will enter the train to be merged, and so on. An MR cannot merge until the previous pipeline succeeds.
The current strategy implemented is One-at-a-time. We create a pipeline only for the first merge request.
If the pipeline fails, the merge request will drop from the train, and a maintainer has to fix it and requeue the train.
There is no UI to see the merge train queue. All the user knows is the position of their merge request in the train.
After we implement parallelization, we will see multiple pipelines running in a single train.
What we will implement with Parallelization:
Differently from the previous method, parallelization is done in only 2 steps. Adding to merge train, and running parallel pipelines. It's faster and efficient.
In issue #11222 (closed), Jason suggests that the user can go back and choose the previous (one-at-a-time) method, and cannot add an MR to the train if the capacity is full.
The parallelization factor for pipelines is between 1 and 10; the default is 4. Technically these are different concepts. The factor has to be limited: users should not be able to set a very high number, as it's dangerous for CI cost, which is why we allow a maximum of 10. On the other hand, increasing the capacity of the merge train itself does not affect CI cost.
We can descope the maximum of 10. Users can enqueue any number of merge requests in the merge train, but at most 4 pipelines run at a time. Once a pipeline slot frees up, the next pipeline waiting in line starts.
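As a rough illustration of this slot behavior, here is a minimal Ruby sketch (my own model, not GitLab's actual implementation; the `MergeTrain` class and its method names are hypothetical). The queue itself is unbounded, but pipelines are only started while fewer than `factor` are running:

```ruby
# Hypothetical sketch of a slot-limited merge train queue.
class MergeTrain
  Car = Struct.new(:mr, :pipeline_running)

  def initialize(factor: 4)
    @factor = factor # parallelization factor; default 4, max 10 per proposal
    @cars = []       # unbounded queue: the train length itself is not limited
  end

  def add(mr)
    @cars << Car.new(mr, false)
    refresh!
  end

  # Start pipelines for the first cars in line until all slots are used.
  def refresh!
    running = @cars.count(&:pipeline_running)
    @cars.each do |car|
      break if running >= @factor
      next if car.pipeline_running

      car.pipeline_running = true # here the real system would create the train pipeline
      running += 1
    end
  end

  def running_pipelines
    @cars.select(&:pipeline_running).map(&:mr)
  end
end
```

For example, adding six MRs to a train with `factor: 4` starts pipelines for the first four only; setting `factor: 1` reproduces the one-at-a-time behavior.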
We should inform the user that the pipeline is at max capacity.
We still allow users to add (click the button), but we indicate that the action won't be performed immediately. The user should not need to perform any other action, as the system will take care of placing the MR in the train eventually. (@rverissimo to add this to the prototype)
Nathan: We previously showed the index in the pipeline, and it makes sense now to move it to the main MR widget. You can add as many MRs to the train as you want, but only 4 pipelines are created. (@rverissimo to add this to the prototype)
Shinya: I don't think this works well. Currently we show the position of the pipeline, but with parallelization it's unclear whether the system has created a pipeline or not.
Shinya: Can we have the parallelization factor in the project settings? We want users to take advantage of it.
Rayana: Are we allowing users to switch from parallel to one at a time?
Shinya: Let's not implement that now and scope it in a different issue. We will allow users to set the range.
Nathan: We can show the option in the settings: set your parallelization factor, from 1 to 10. This simplifies the solution, as one-at-a-time equals a parallelization factor of 1.
Corrina: There are a lot of user flow possibilities in this change; I feel we should do some UX research in the next iteration to understand the overall feature.
We should inform the user that the pipeline is at max capacity.
Is it important that users know that their merge train pipeline has or hasn't started? In my opinion, this is kind of an implementation detail that the user shouldn't have to care about. We already show the current position of the merge request on the merge train - is this enough info for the user?
I'm wondering if this bullet point is a hold-over from when we thought we were limiting the length of merge trains. Since we're no longer restricting the merge train length (in this release), is this bullet point still relevant?
That's a good point @nfriend. I will go over the issue and collect some of those questions for when we're ready to include UX research to clarify our assumptions. cc @loriewhitaker @cstasik
In my point of view, this information is a nice to have.
Also, just to clarify, will the application create the pipeline but not start it until it's possible to have a pipeline for the merge train? Because if we say the pipeline is created, but not started/enqueued, it should still show as created. I see some pipelines that have all stages like:
Is it important that users know that their merge train pipeline has or hasn't started? In my opinion, this is kind of an implementation detail that the user shouldn't have to care about. We already show the current position of the merge request on the merge train - is this enough info for the user?
The most important information (IMHO a must-have), and missing today, is which pipeline is the pipeline for merge train and which pipelines are of other types. While users are developing on the branch, they push commits and other pipelines are created (these are most likely pipelines for merged results or detached merge request pipelines). Technically, pipelines for merge train have to be clearly distinguished from them, but currently they are shown as pipelines for merged results.
For example, there are two merge requests and users are seeing the following UI.
Merge Request 1
Merge Request 2
and, now I have a question for you.
Which position is the merge request in the merge train?
Which pipeline is Pipelines for merged results?
Which pipeline is detached merge request pipeline (i.e. branch pipelines)?
Also, just to clarify, will the application create the pipeline but not start it until it's possible to have a pipeline for the merge train? Because if we say the pipeline is created, but not started/enqueued, it should still show as created. I see some pipelines that have all stages like:
The application (merge train) will not create a pipeline if it has reached the parallelization limit.
The most important information (IMHO a must-have), and missing today, is which pipeline is the pipeline for merge train and which pipelines are of other types.
@dosuken123 Good point. My first thought would be to show Merge train pipeline #123456789, instead of Pipeline #123456789. Although @rverissimo may have a better idea of how we should distinguish these.
The prototypes below display the state of a full merge train, when the user tries to add an MR. We still allow users to add (click the button), but we indicate that the action won't be performed immediately. The user should not need to perform any other action, as the system will take care of placing the MR in the train eventually.
Please feel free to provide feedback. Let me know if there are any other use cases that need to be covered. I understood that we don't show the message/case for start merge train, as the index will be 0.
Add to merge train, with information about the train being full. User can still add the merge request to a train, but it won't start a pipeline until a slot is available.
Add to merge train when pipeline succeeds, with information about the train being full. User can still add the merge request to a train, but it won't start a pipeline until a slot is available.
Once it's added to the merge train, user sees an updated version of the widget section with the confirmation. Helper text shows same information about the train being full.
Once set to be added to the merge train when the pipeline succeeds, user sees an updated version of the widget section with the confirmation. Helper text shows same information about the train being full.
@nfriend to be honest, this message didn't show for me when I was playing around with the UI in an existing merge request; you can see it below. But it shouldn't replace any of the existing messages, unless something changes in the background. Also, we don't know for sure which pipeline will be running before this one enters the train... correct? @dosuken123 If so, then yes, I think we need to update the message.
[Edit]: Actually, never mind. The example in my project shows start merge train. But in any case, I have two different merge requests still running their pipelines: one is already set to start the train when the pipeline succeeds, but the other doesn't show the correct button. It displays start mtwps instead of add to mtwps.
@rverissimo This is because I haven't developed this message yet. I was just working on the MR to add this message a few days ago, so I thought I should call it out since I didn't see it in your prototypes above. Should I hold off on this MR until we hear back from the technical writers, or should I keep moving forward with the message as described in the issue?
Thank you @vkarnes and @mikelewis! This is a challenging one, so thank you for jumping in on copy. @marcia, would it be helpful to schedule a synchronous call to walk through this, given the complexity?
We're just stating the fact that the pipeline for merge train will not be created immediately; however, users would not care about that, because what's important to them is that their merge request will be merged eventually. They don't even revisit the MR once they set up the merge train. If they do revisit, it's because something went wrong, e.g. a merge request that was supposed to have been merged hasn't been. In addition, even if the pipeline is created, it doesn't mean the merge request will be merged immediately after the pipeline finishes: it won't be merged until the previous merge requests have been merged. That said, the information about the pipeline limitation is less valuable than other information.
@nfriend I'd say move forward with the current messaging. Let's ping @marcia on Slack for awareness.
We're just stating the fact that the pipeline for merge train will not be created immediately; however, users would not care about that, because what's important to them is that their merge request will be merged eventually.
Again, we want to provide users with the right feedback at the right time. I think we can validate our assumptions here. Do we want to inform users that the action they want to perform won't happen immediately:
Add an MR to a merge train
Add an MR to a merge train when the pipeline succeeds
I don't think the point here is to revisit the merge request to see if the pipeline has started or not, but to provide users with a sense of direct manipulation and help them visualize their actions.
@dosuken123 Do we know which pipeline is running "before" my current pipeline (enqueued, waiting to enter the parallelization)? Or is it that, because we don't know which pipeline comes before, we shouldn't show the message below?
Do we know which pipeline is running "before" my current pipeline (enqueued, waiting to enter the parallelization)?
Yes
Or is it that, because we don't know which pipeline comes before, we shouldn't show the message below?
I think we've not implemented the message yet; @nfriend can confirm. Also, I don't understand the message: should it be ... after the pipeline for #pipeline-id finishes?
Yes, this is not implemented yet. I think in the past we assumed that we would be able to tell which pipeline would finish first in order to let users know that their merge train would be formed on top of pipeline X. This seems redundant now though.
What is your suggestion for this? From what I understand now, we can remove part of the message displayed above and combine it with the parallel-merge situation. But in some cases users will have only one pipeline running at a time (if so configured), so the messaging should be different.
My suggestion has already been described above: https://gitlab.com/gitlab-org/gitlab-ee/issues/11222#note_187028871. I don't understand who (which persona) cares about the pipeline capacity at the merge request level. This could be useful information for administrators who manage the stream of release orchestration and balance CI costs. Even if developers/maintainers saw it, it wouldn't affect their decisions on merges, as they know their MRs will be merged sooner or later.
Taking a look at the screenshot again: at the very least, we should remove You can also merge immediately. We're prompting users to take an undesired path in the first place. In merge train culture, immediate merge should be reserved for emergencies, such as a critical patch for a production incident.
@jlenny, what happens when the train is full, MRs pile up, and then a slot frees? Does it form a new train with all outstanding MRs and try to catch up by scheduling the TAIL of that train, so that in the best case it clears the whole queue? On failure, it could use git-bisect-like logic to find the offender.
Users can enqueue any number of merge requests in the merge train, but at most 4 pipelines run at a time. Once a pipeline slot frees up, the next pipeline waiting in line starts.
@rverissimo, isn't it a wasted opportunity? Merge trains assume that most MRs are fit for merging; when there is a new slot and there is a queue, the LAST pipeline should be scheduled to maximize train speed.
@dosuken123 Can you provide your thoughts from @redbaron1 on the Merge train queue approach?
isn't it a wasted opportunity? Merge trains assume that most MRs are fit for merging; when there is a new slot and there is a queue, the LAST pipeline should be scheduled to maximize train speed.
@cstasik, @dosuken123: by starting tail pipelines you get high merge speeds without doing any parallelization at all. You introduce a fixed delay of a single MR's pipeline time for all queued MRs, but in return you get effectively unlimited merge throughput, assuming all MRs are green.
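To make the tail-scheduling argument concrete, here is a back-of-envelope model (my own illustration, not from the issue; the function names are hypothetical). With a pipeline duration of `t` and `n` queued MRs that are all green, one-at-a-time costs `n * t`, while running a single tail pipeline for MR1+...+MRn costs a fixed `t` regardless of queue length:

```ruby
# One-at-a-time: each MR waits for the previous pipeline to finish,
# so the total time grows linearly with the queue length.
def one_at_a_time_cost(n, t)
  n * t
end

# Tail scheduling: run one pipeline for the combined MR1+...+MRn ref;
# if it is green, the whole queue merges after a single pipeline duration.
def tail_schedule_cost(_n, t)
  t
end

# one_at_a_time_cost(10, 30) => 300 (minutes)
# tail_schedule_cost(10, 30) => 30  (minutes)
```

The trade-off in the failure case, as discussed above, is that a red tail pipeline gives no information about which MR is the offender, so extra pipelines (or bisect-like logic) are then needed.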
@darbyfrey Can we look into doing an async demo on this? I think it's needed given the complexity. Also, we should schedule it with ample time for corrections to meet %12.1.
Marcia: I think in the merge trains case, for the documentation, an illustration will help users understand the functionality better.
Rayana: There's a lot of edge cases, and we also plan to implement different strategies for merge trains. It will help us a lot internally to better understand this functionality.
Rayana: How do we do this illustration? Should designers provide it, or does someone else take care of it?
Marcia: I can work with the illustrations we already have. Who can provide the flows?
Corrina: Shinya is the best person for this.
Corrina: We're doing parallel as default, and user should have the option to change to one-at-a-time.
Marcia: The best thing is to get someone from Release to do a demo of this feature, explaining how to use it, and giving examples of the capability.
Marcia: I am creating a new documentation directory for merge trains, under pipelines for merged results.
Marcia: Ideally I would have a total understanding of the feature to be able to review the copy. If you have any questions, you can tag me in a merge request so I can review it; that's easier.
To make sure we're aligned about the review for the current UX prototypes, we have decided to move forward with the proposed copy that we post in this issue (SSOT), and @marcia will review it once we have a merge request for the front-end. This way we also give her time to familiarize herself with the merge trains functionality. @cstasik
@dosuken123 @jagood Based on progress so far, do you have an estimated time by which we could complete item 1 below? The preference would be next week, in time for @marcia to complete blog-post-ready items for %12.1.
Great, thanks, @cstasik! Yep, that's pretty much it.
I'm not sure about the release post deadlines since we changed GitLab's deployment to auto-deploy last month, so the sooner we have everything ready (feature + docs + demo video), the better.
@dosuken123 please assign the MRs to me when ready for tw review.
If any of you have any questions or need any other help, please ping me any time.
@dosuken123 @nfriend @cstasik please have a look at the comments above. I feel there's a lot of miscommunication going on around this feature, and the current description is not the SSOT. Please leave your comments in this thread so we can update the SSOT.
We remove the merge trains option completely from the project settings
Problem: How can we indicate to users that merge trains are the default strategy once we remove the option completely from the settings page? I believe this can become confusing once we start making references to merge trains in the application's UI.
Proposal: Keep the Merge Trains section in the project settings so that users know this is the default strategy. Users cannot turn it on/off, but can still see it as a merge option.
We don't limit the maximum length of a merge train, although we will limit the maximum parallelization
From current issue description:
We don't limit the maximum length of a merge train, although we will limit the maximum parallelization. For the MVC we set the parallelization factor to 4. In future iterations this can potentially be tuned or made configurable, though our desire is to limit configuration options as much as possible.
From my understanding, for the sake of this MVC, we don't allow users to set the parallelization limit. This way, there is no way for users to go back to one-at-a-time; we don't allow users to enter a numeric value (1-4) that would in effect set the strategy.
The application (merge train) will not create a pipeline if it has reached the parallelization limit
This comment https://gitlab.com/gitlab-org/gitlab-ee/issues/11222#note_187019127 is interesting when we talk about whether to display a message letting users know that the parallelization capacity of a merge train has reached its peak. For the sake of this MVC, we don't display any information about the parallelization limit in the merge request widget UI.
We remove the merge trains option completely from the project settings
I commented on https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/14429#note_188633546, and my proposal is to reduce the amount of technical information, such as "up to 4 at a time", and have a simple description like "You can also use Merge Train for sequentially merging multiple merge requests (information-icon)" under the "Merge pipelines will try to validate the post-merge result prior to merging" option.
We don't limit the maximum length of a merge train, although we will limit the maximum parallelization
Yes, this is my understanding in 12.1 scope. @cstasik Please create a follow-up issue and update this issue description.
The application (merge train) will not create a pipeline if it has reached the parallelization limit
Yes, I believe we don't need this information for now.
@ayufan @oswaldo We're currently facing yet another technical difficulty: how to check whether a pipeline for merge trains is up to date.
To explain the problem, let's say there are two merge requests in the merge train.
master <= | MR1 | MR2 |
Now, each MR has the following pipeline for merge train. (I also described the master branch's state to frame this problem correctly.)
| Location | pipeline.ref | pipeline.sha | pipeline.target_sha | pipeline.source_sha |
|---|---|---|---|---|
| master (Non-MR) | refs/heads/master | sha-master-head-A | N/A | N/A |
| MR1 | refs/mr/1/train | sha-mr1-train-A | sha-master-head-A (= HEAD(refs/heads/master)) | sha-mr1-head-A (= HEAD(refs/mr/1/head)) |
| MR2 | refs/mr/2/train | sha-mr2-train-A | sha-mr1-train-A (= HEAD(refs/mr/1/train)) | sha-mr2-head-A (= HEAD(refs/mr/2/head)) |
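The chaining in the table above can be modeled in a few lines of Ruby (my own illustration, outside GitLab; the repository HEADs are faked with a Hash, and the ref/sha names mirror the table). The first car targets the HEAD of master, and each subsequent car targets the HEAD of the previous car's train ref:

```ruby
# Fake repository HEADs, mirroring the table above.
heads = {
  'refs/heads/master' => 'sha-master-head-A',
  'refs/mr/1/train'   => 'sha-mr1-train-A',
  'refs/mr/2/train'   => 'sha-mr2-train-A'
}

# Train cars in order: MR1 is first, MR2 second.
train_refs = ['refs/mr/1/train', 'refs/mr/2/train']

pipelines = train_refs.each_with_index.map do |ref, i|
  previous_ref = i.zero? ? 'refs/heads/master' : train_refs[i - 1]
  { ref: ref, sha: heads[ref], target_sha: heads[previous_ref] }
end
# pipelines[0][:target_sha] => "sha-master-head-A"
# pipelines[1][:target_sha] => "sha-mr1-train-A"
```

The cases below are then exactly the situations where some `target_sha` in this chain no longer equals the current HEAD of the ref it was computed from.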
Let's consider three cases as the next move. The above table will be updated as the following:
Case 1: MR1 was merged into master branch.
| Location | pipeline.ref | pipeline.sha | pipeline.target_sha | pipeline.source_sha |
|---|---|---|---|---|
| master (Non-MR) | refs/heads/master | sha-master-head-B | N/A | N/A |
| MR2 | refs/mr/2/train | sha-mr2-train-A | sha-mr1-train-A (= HEAD(refs/mr/1/train)) | sha-mr2-head-A (= HEAD(refs/mr/2/head)) |
Case 2: MR1 was dropped from the merge train.
| Location | pipeline.ref | pipeline.sha | pipeline.target_sha | pipeline.source_sha |
|---|---|---|---|---|
| master (Non-MR) | refs/heads/master | sha-master-head-A | N/A | N/A |
| MR2 | refs/mr/2/train | sha-mr2-train-A | sha-mr1-train-A (= HEAD(refs/mr/1/train)) | sha-mr2-head-A (= HEAD(refs/mr/2/head)) |
Case 3: An emergent patch has been merged into master.
| Location | pipeline.ref | pipeline.sha | pipeline.target_sha | pipeline.source_sha |
|---|---|---|---|---|
| master (Non-MR) | refs/heads/master | sha-master-head-B | N/A | N/A |
| MR1 | refs/mr/1/train | sha-mr1-train-A | sha-master-head-A (= HEAD(refs/heads/master)) | sha-mr1-head-A (= HEAD(refs/mr/1/head)) |
| MR2 | refs/mr/2/train | sha-mr2-train-A | sha-mr1-train-A (= HEAD(refs/mr/1/train)) | sha-mr2-head-A (= HEAD(refs/mr/2/head)) |
Now, the question is: how can we make sure that each pipeline.target_sha is the HEAD of the previous pipeline.ref, or is a commit which contains exactly the same contents as the previous pipeline.ref? The former check is relatively easy: you can just test `pipeline.target_sha != project.repository.commit(previous_ref)`. The latter condition, however, is tricky; it happens in Case 1.
Probably we'd need to expand the first/second parents of the previous pipeline.sha and the current pipeline.target_sha and compare them. If they match exactly (i.e. have the same parents), the pipeline can be considered to be at the latest state.
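A rough Ruby sketch of these two checks (simplified and outside GitLab; `repo`, `commit`, and `parent_ids` are hypothetical stand-ins, not the real GitLab API):

```ruby
# Easy check: target_sha is literally the current HEAD of the previous ref.
def target_is_previous_head?(pipeline, previous_ref, repo)
  pipeline[:target_sha] == repo.commit(previous_ref)
end

# Case 1 check: the merge commit now on master and the previous train
# pipeline's merge commit merged the same contents if their parent sets
# are identical (same first/second parents).
def same_contents?(previous_train_sha, target_sha, repo)
  repo.parent_ids(previous_train_sha).sort == repo.parent_ids(target_sha).sort
end
```

A pipeline would be considered fresh if either check passes; as noted below, squash commits and fast-forward merges break the parent comparison, which is why restricting the supported merge method comes up.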
Any suggestions to address this problem are welcome.
@dosuken123, in Case 1 the master pipeline.sha can also be 'sha-mr1-train-A'; that would be a fast-forward of the master branch to the train commit, which itself is an intended merge. Would that help?
or is a commit which contains exactly the same contents as the previous pipeline.ref
The tricky part here is that we don't have much control over what refs/heads/master might represent; it can be a squashed commit or a fast-forward, so fetching parent_ids might be misleading.
That said, shouldn't we consider re-creating/stopping the merge train if the source/target branch of any of these MRs changes, or if any action such as a merge is taken on any MR within the train?
I wonder if we should ever allow a manual merge to happen on an MR within the train, too.
Though I might be seeing the problem through the wrong lens now.
@redbaron1 @oswaldo Thank you for the input. Given that MergeRequests::MergeToRefService doesn't support fast-forward merges today, maybe we can drop the fast-forward case from merge trains as well. Pipelines for merge train always run on a merge commit created via a non-ff merge.
That said, shouldn't we consider re-creating/stopping the merge train if the source/target branch of any of these MRs changes, or if any action such as a merge is taken on any MR within the train?
I wonder if we should ever allow a manual merge to happen on an MR within the train, too.
We wouldn't want to do that, because some pipelines have already run in parallel. Unless we're 100% sure a pipeline is stale, we wouldn't want to recreate the train ref or the pipeline for merge train, as that would create many pipelines per action and waste Runner resources.
We do not encourage users to perform manual merges, but in case an urgent patch is created for master, the system should allow this exceptional case. In that case, we allow the merge train to regenerate its train refs and pipelines.
it can be a squashed commit
My understanding is that squash is not supported in Auto Merge either; for example, users cannot choose the Squash option as a merge option. So we can document that we only support non-ff merges today, and that the other merge methods (with squash or ff merge) are not supported for now.
Pipelines for merge train always run on a merge commit via non-ff-merge.
That's fine; my point was that the very merge commit MR1 ran on can become refs/heads/master, so you don't need to create another merge commit, and thus Case 1 becomes a non-issue: MR2 magically starts to target refs/heads/master.
My understanding is that squash is not supported in Auto Merge either; for example, users cannot choose the Squash option as a merge option. So we can document that we only support non-ff merges today, and that the other merge methods (with squash or ff merge) are not supported for now.
We're still able to select Squash commits when merge request is accepted on the MR edit page and leave it set, so it's still possible to end up with a squash commit, if I understand the question correctly :)
@marin I was wondering if your team has already started using merge trains and if there is any feedback. @rverissimo and I are very interested in internal customer feedback, to learn what needs to be modified for a great user experience.
@ogolowinski Delivery did not, but I think we should roll this out to the EE or CE repository, which would benefit the whole engineering organization. I think this might be in the Developer Productivity bucket, so @rymai might have better insight into how to roll this out.