Support for Alternative Workflows when using Monorepo

Release notes

Support for Monorepo Workflows through Trunk Based Development

Problem to solve

For the past 10 years, the standard workflow for Git has been a branching model. It was easy to explain to new users: they could create staging areas for changes to share with others while keeping the primary branch clean for builds and releases. As products added collaborative features around this branching model, we became familiar with PR/MR-style workflows that use hooks to enforce security, compliance, and code review before a merge can take place. This works well for loosely coupled applications that can be built independently of other repositories.

Today, we are seeing companies that want a model better suited to an all-in-one repository, which makes it easier to define a single set of rules for collaboration, build, and deployment. Many people call these monorepos, because everything from source code to documentation to infrastructure definitions is combined into a single repository. Because these repositories tend to be large, a branching workflow does not always work: there could be thousands of developers committing tens of thousands of changes per day across millions of lines of code. Large technology companies such as Google, Microsoft, Netflix, and Twitter, to name a few, have been very public about working this way.

For these larger, monorepo-style repositories, an older paradigm called trunk-based development is very commonly used. This model puts more ownership on developers, who are expected to validate many of their changes locally before making them public. To implement this style of development, most tools create a staging area for patches submitted by developers. As patches are accepted, they are inserted into the trunk (main) repository through a series of merge trains that know how to assemble the code changes in the right order. This model was very common back when we used SVN, CVS, and ClearCase, and in the OSS world with Bugzilla.
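To make the merge train idea concrete, here is a minimal, simplified sketch in Python. It is not how any particular tool implements it. The `Patch` type, the `run_merge_train` function, and the dictionary used as a stand-in for the repository are all hypothetical names chosen for illustration. The sketch assumes patches are processed strictly in submission order and that each one is checked against the trunk plus every patch accepted before it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a repository: path -> file contents.
Tree = Dict[str, str]


@dataclass
class Patch:
    """A hypothetical submitted patch: an id plus the files it sets."""
    patch_id: str
    changes: Tree


def run_merge_train(
    trunk: Tree,
    queue: List[Patch],
    check: Callable[[Tree], bool],
) -> Tuple[Tree, List[str], List[str]]:
    """Apply queued patches in order, validating each on top of its predecessors.

    A patch that fails the check is dropped and the train continues with
    the last good state. The input trunk is not modified.
    """
    state = dict(trunk)
    accepted: List[str] = []
    rejected: List[str] = []
    for patch in queue:
        # Candidate state = everything accepted so far + this patch.
        candidate = {**state, **patch.changes}
        if check(candidate):
            state = candidate
            accepted.append(patch.patch_id)
        else:
            rejected.append(patch.patch_id)
    return state, accepted, rejected


# Example: the "build" passes only if no file is empty (an assumed rule).
queue = [
    Patch("p1", {"b.txt": "2"}),
    Patch("p2", {"a.txt": ""}),   # would break the build
    Patch("p3", {"c.txt": "3"}),
]
new_trunk, accepted, rejected = run_merge_train(
    {"a.txt": "1"}, queue, lambda tree: all(tree.values())
)
```

In a real system the check would be a CI pipeline, and trains typically validate several patches in parallel rather than one at a time. The sequential loop is only meant to show the ordering guarantee.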

I have seen this issue directly when working with the Spinnaker community. The product is made up of 12 different services that were forced to move to separate repositories because vendors won't support single repositories larger than 5 GB. As a result, it is difficult to search for issues, find the right place to make changes, create shared components, apply coding best practices, and simplify build scripts. It becomes a dance that requires a lot of orchestration across multiple build scripts that use different packages as build dependencies, which creates fragmentation and risk around security and vulnerabilities. This fragmentation has made collaboration difficult and caused the community to shrink, and the same could just as easily happen inside an enterprise that is forced to split its repositories.

This issue is not meant to debate the advantages or disadvantages of one model over the other, but to discuss how we can accommodate both workflows while maintaining the performance and scale that our customers expect.

Proposal

There are a few ways we could technically accomplish this:

Partial Forks - #342158
Stacked Diffs - https://kurtisnusbaum.medium.com/stacked-diffs-keeping-phabricator-diffs-small-d9964f4dcfa6#.jd1kqorhb - When a developer is working on a branch locally, they can push the diffs to a staging area. This staging area would generate an MR for each "chain" of patches from a given developer. This keeps object sizes in the repository small, because patches are merged only once they are accepted, instead of objects being created up front in the repository to hold the changes. It also reduces the problem of long-running branches. This staging area could become very powerful when combined with our merge train concept.
Git Stash - Is there a way to use the git stash capability so that a developer can publish a stash for review? A Stash MR could be used to review the changes and then apply the stash to the repository.
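
As a rough illustration of the Stacked Diffs option, the sketch below groups staged patches into chains, with one chain corresponding to one MR. It is only a sketch under stated assumptions, not a proposed implementation. `StagedPatch`, its `parent` field, and `group_into_chains` are hypothetical names. It assumes each patch records the patch it builds on (or `None` if it is based directly on trunk) and that stacks are linear, so a patch has at most one child.

```python
from typing import Dict, List, NamedTuple, Optional


class StagedPatch(NamedTuple):
    """A hypothetical patch sitting in the staging area."""
    patch_id: str
    author: str
    parent: Optional[str]  # id of the patch this one builds on, None = trunk


def group_into_chains(patches: List[StagedPatch]) -> List[List[str]]:
    """Return one ordered list of patch ids per chain, root first.

    Assumes linear stacks: if two patches named the same parent, only
    one of them would be followed.
    """
    # Map each parent id to the patch stacked directly on top of it.
    child_of: Dict[str, StagedPatch] = {
        p.parent: p for p in patches if p.parent is not None
    }
    chains: List[List[str]] = []
    for root in patches:
        if root.parent is not None:
            continue  # not the bottom of a stack
        chain = [root.patch_id]
        current = root
        # Walk upward through the stack until no patch builds on the tip.
        while current.patch_id in child_of:
            current = child_of[current.patch_id]
            chain.append(current.patch_id)
        chains.append(chain)
    return chains


staged = [
    StagedPatch("a1", "alice", None),
    StagedPatch("a2", "alice", "a1"),
    StagedPatch("b1", "bob", None),
    StagedPatch("a3", "alice", "a2"),
]
chains = group_into_chains(staged)  # each inner list would back one MR
```

Only the accepted chain would ever be written to the main repository, which is the property the proposal relies on to keep object sizes small.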

These are the three solutions I could come up with, but I hope others in the community have additional solutions as well.

Intended Users

This solution would involve every persona listed at https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/

Metrics

cc:// @sarahwaldner @adawar @brianwald

Edited Nov 03, 2021 by 🤖 GitLab Bot 🤖