For historical reasons, the collaboration model in GitLab has followed a repo-centered approach. A GitLab project is essentially a wrapper around a single repo, and all the powerful collaboration features are based within that project. This is great for teams that work within a small number of repos. A typical use case is that there is an idea, and somebody creates an issue with respect to that project-repo. This provides powerful integration features with code review / merge requests, and the CI/CD process at the tail end. However, there currently is a gap in GitLab where many organizations have a team-centered or product-centered approach that spans many repos. We ourselves experience this problem as a slight annoyance with CE+EE. For many of our customers, this is more pronounced and they have been requesting features to alleviate the pain. So far, our response is to offer the Group paradigm to mash projects together. But this is not sufficient, because it still forces our users to think in projects/repos, when software product development should start at the user (not at the technology stack). Since our vision is getting ideas and shipping them to production quickly, we need a strong team-centered / product-centered approach to collaboration.
The scope of this issue is to have a high-level discussion of the problem, and then to think about our strategy to tackle it. It may involve some major changes to GitLab as a whole, or introduce some radical new paradigms. Or perhaps there is an iterative path to get us to a stage that can solve team-centered use cases? And do we want to abandon / sunset or at least reduce the promotion of a repo-focused workflow? The output of this issue should be a set of principles and common use cases of folks using GitLab from end-to-end. We should also have a direction and strategy, and commentary from our engineering team on if/when and how to tackle any major technical hurdles.
As we scale up and truly build end-to-end tools for our customers, this is a very critical time to nail down the right paradigms. In particular, issue boards are just ramping up. The board paradigm is extremely powerful and will allow us to majorly disrupt other issue management trackers in the project management space. So it's important that we introduce the right concepts to our customers, so that they can truly abandon JIRA, Trello, etc., and use GitLab as a single solution.
These are typical scenarios in enterprises which highlight the existing gaps of GitLab. One salient observation is that the software industry is moving away from monolithic systems and toward microservices. Many organizations are breaking apart their big systems into many many small microservices, sometimes even over 100 to support a business with many different offerings. (This may be a blind spot for us here in GitLab, where we ship a single, installable product.) It would not be strange for an organization to have over 100 repos, each for a single microservice.
In this scenario, there is one (fairly technical) product manager (PM), one engineering manager / engineering lead, and a handful of backend engineers. They are responsible for the platform stability of 5 microservices. They develop APIs. They ensure performance. They do a lot of the tech debt clean up. Because of the microservice architecture, they can run quickly and independently without interfering with other teams (as long as the API remains the same). The PM may want to ship a new feature that improves the overall performance for the entire platform of the 5 microservices. Here, the PM creates a ticket in the issue tracker. But the PM should not necessarily care or know what repos it impacts at the time of issue creation. Later on, the tech team digs into the ticket, and figures out what code needs to be written, and where. All those subsequent code changes (merge requests) should then be linked to the ticket somehow. But the ticket itself is not by itself associated with any particular repo. The tickets of this team are with respect to the team, not particular repos.
In this scenario, there is one product manager, one engineering manager / engineering lead, a handful of engineers (backend, frontend, mobile, full-stack), and maybe a few designers. The engineers do not "own" any particular codebase (as opposed to the previous scenario). Instead, they work on any piece of code (with help from the platform teams as experts). Here, again the PM and designers decide to create a new ticket to ship a feature. The PM doesn't know or care too much about the detailed technical implications. The PM just creates the ticket. Later on, the engineering team massages the ticket, offering opinions, especially technical ones, before digging into the work. It may turn out that one particular ticket touches 10 different repos. And maybe the PM will then chop up the ticket into multiple tickets to release more strategically. But in any case, the tickets, and the entire collaboration process, are team-centered and product-centered. Only the code review and CI/CD stages are repo-centered. But these are back-integrated into the user-focused tickets themselves. It doesn't make sense to force the tickets to be associated with any repos.
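To make the two scenarios above concrete, here is a minimal sketch of a ticket that belongs to a team and only later picks up merge requests from whichever repos the work lands in. This is not how GitLab models issues; `TeamIssue`, `MergeRequest`, and the repo names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class MergeRequest:
    repo: str            # repository the code change lives in (hypothetical name)
    iid: int             # per-repo merge request number
    merged: bool = False


@dataclass
class TeamIssue:
    """A ticket owned by a team, not by any single repository (hypothetical model)."""
    team: str
    title: str
    merge_requests: List[MergeRequest] = field(default_factory=list)

    def link(self, mr: MergeRequest) -> None:
        # Engineers attach merge requests later, once they know where the code goes.
        self.merge_requests.append(mr)

    def repos_touched(self) -> Set[str]:
        return {mr.repo for mr in self.merge_requests}

    def ready_to_close(self) -> bool:
        # An issue with no linked work is not considered done.
        return bool(self.merge_requests) and all(mr.merged for mr in self.merge_requests)


# The PM files the ticket without naming any repo.
issue = TeamIssue(team="platform", title="Improve platform-wide response times")

# The tech team later links the code changes, which span two repos.
issue.link(MergeRequest(repo="auth-service", iid=12, merged=True))
issue.link(MergeRequest(repo="billing-service", iid=7))
```

The point of the sketch is that the repo is a property of each merge request, not of the ticket, so the ticket can be created before anyone knows which repos it will touch.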
Tracking bugs can be especially difficult. A bug may be first reported by a customer externally. It may be logged in a support ticket system like ZenDesk. Then the ticket may need to flow through to different teams, because the root cause cannot be determined easily. The teams could be working on entirely different technologies (operating system, middleware, app layer, etc.) throughout the organization (though still in software, for our initial use cases at least). How do we handle this situation? Should there be a single ticket? Multiple tickets linked together?
As discussed with @JobV : This is a high-level concern I have, but extremely relevant as we make GitLab awesome for organizations and disrupt the issue tracker space.
No specific detailed next steps yet. Wanted to get everyone's input and see what good ideas you have. A knee-jerk reaction is to implement group issues. I.e. issues scoped to groups. But I want us to be more thoughtful and considered here.
Thanks @victorwu. Some great thoughts here and things to address. There might be an opportunity here to work with @sarahod on confirming the scenarios and needs with customers (or potential customers).
I think there are more scenarios we may need to consider. And I know you talked about companies moving to microservices -- but I think we even have complications at the monolithic scale. For example, consider an enterprise that owns multiple layers of the stack. Talking specifically from my experience, you have teams working on the runtime and compiler, on the core OS, on the front end of the OS, and on an app that runs on the front end of the OS. A bug comes in. It isn't really clear where the bug is occurring. Maybe initially it seems like it is at the app level. But as you dig into it, there are multiple dependencies on other layers of the stack that other teams own that are causing it. Maybe there was even a bug on the compiler for the compiler of the app (an actual story I've heard). This issue/bug/ticket has now sprawled out into many repos and code branches across divisions of a large company. Now this is an extremely complicated case, but I'd like to push on and validate the scenarios a little bit to make sure we are going after the right goals.
Will have to think on this more - but thanks so much for writing this up!
Thanks @awhildy for the feedback. I've added your scenario to the description. I like it because it's definitely one of the more extreme cases, and helps us understand the wide scope of the problem. It gives us a sense of scale and opportunities.
I'd like to push forward quickly with issue boards. But let's tread very carefully so we don't over-commit and make decisions that will force us into a corner later. For example, group level issue boards is a hotly requested feature to solve some of these scenarios. But I'm not convinced it's the best one long-term for GitLab. Feel free to dump more comments here in general. Let's sync up in a week or two on some large and medium things to tackle.
A bug comes in. It isn't really clear where the bug is occurring. Maybe initially it seems like it is at the app level. But as you dig into it, there are multiple dependencies on other layers of the stack that other teams own that are causing it.
+1 on this, this is a case even we have, with a pretty much monolithic app. It would be much worse if we had more projects, and we'd feel this pain more.
Still, I'd love to keep GitLab issues as lightweight as possible. I really don't want to end up just building JIRA, which is great for what it is, but not my tempo.
A couple of thoughts.
Google has one repository: https://news.ycombinator.com/item?id=7020584
To a great extent it's a developer habit to split dissimilar codebases into different repositories (and so, in gitlab as-is, into different projects). It doesn't have to be that way. There is an open MR to absorb gitlab_git into the gitlab-ce repository, for instance, and there's no compelling reason why gitlab-runner, gitlab-workhorse, gitlab-pages, etc, could not also be in that repository. Then all the issues are in a single place, and one moves to labels to track them down.
If you have a repository per microservice, and you have hundreds of microservices, it's going to get painful quickly. So how far can we get by changing developer minds on this?
On the other end of the scale, gitorious allowed multiple repositories per project. This breaks the (very developer-focused) idea of a project and a codebase being one and the same thing, and allows many of the things we're currently pushing into the group level (milestones, labels, etc) to remain at the project level instead.
Thanks @nick.thomas for the insightful comments and the reference as well. Indeed we have a hard balancing act to 1) Be opinionated in our product, 2) Attract many different organizations with different processes, and 3) Harmonize the first two by influencing behavior in our customers.
I've thought about the multiple repos per project concept. And I personally would like that solution rather than groups of projects to solve most use cases. This is more gut instinct at the moment.
Mmm, to me, groups are principally about access control. What happens if we take to calling them "teams" instead?
Pushing project management features into them is a way to satisfy people who have fairly complicated projects, but aren't following a google-like repository philosophy (those with simple projects aren't suffering, of course!).
I'd keep the project management stuff in projects, keep groups for access control, and just add more repositories. Projects actually already have two git repos - there's a wiki as well as the project repo - so it's not too much of a stretch.
@JobV + @awhildy : What do you think about @nick.thomas 's proposal for having multiple repos in projects, to solve many of the use cases outlined in this issue? I whole-heartedly agree that "projects" are for project management, as the name implies! And "groups" being just access control seems very natural, and just from some googling, that is also the historical context of it. (Weren't they called "Teams" or "Organizations" before?)
At this point, my primary concerns are:
1) We have existing customers. They are asking for group-level issue boards and many of these team-centered use cases. Even if we build projects that support multiple repos, will they adopt it as designed? For example, we could roll out projects and project issue boards that cut across multiple repos. And it would totally rock. But would many existing customers complain about migration pain?
2) We should get moving in this direction and have to make some decisions soon. If we want to strategically make a splash in the issue board space, we really need to have a clear path and start moving quickly.
@JobV : What's the urgency in this regard? If we can't establish a clear vision and strategy for team-centered collaboration, I'd prefer we do not focus on team-centered issue boards, and instead build out the per-project features first, and defer until later. If team-centered issue boards are indeed urgent, let's make some decision, even though it may be wrong in the long term. I'd rather we be "wrong", but be intentional in shipping and catching the wave of customers we want.
@nick.thomas : Can you comment on the technical feasibility and any major hurdles you foresee in allowing multiple repos per project?
From a user-facing, product-facing standpoint, it seems absolutely do-able. Nothing would be broken. All issues, merge requests, etc. would still look and behave the same, and all our existing customers would not notice a difference until they needed that 2nd, 3rd, etc. repo. So product-wise, we would just need to build out all the additional views to support multiple repos and those navigation elements. But I don't see anything that would be fundamentally broken, in terms of UI and user's expectations.
I like the idea of having multiple repositories in the same project, and if there's exactly one repository, the UX should stay the same. I don't "feel" this would be hard technically, but I would guess that tweaking the UX for multiple repositories might be challenging, as sometimes we might want to read merge requests or issues individually amongst different repositories, but sometimes we still want to have an overview across repositories. Maybe a filter would work, but I don't know.
@victorwu the main problem is that a whole bunch of things which are currently project-level have to become repository-level. One example is permissions - it would be perfectly reasonable to want some repositories in a project to be public, and others private, although that could be a second iteration change.
Commits and branches would need scoping by repository, and that's a bit of a challenge. Not insurmountable, but the UX would need careful thought.
Something to consider: even with this concept of projects, is there still a use case for multi-project issue boards?
I think there might be. For instance, a CTO might be interested in the progress of a number of projects and want to see them all together to aid resourcing decisions.
@victorwu urgency is only driven by our ambitions.
I do believe in multi-project (group level) issue boards, even if we were to implement any of the proposals here. It's nice to have an overview. But if we think that it makes sense to first develop further features before going to that level, I'm fine with it.
The idea of having multiple repos per project is not unique. I haven't spent much time thinking about it. I do think many things would and will break if we go that route (permissions, protected branches, CI, etc), so it might be too late to do that. I know that Deveo (http://deveo.com/) does this and it gives them the ability to support Android repos, for instance.
What is the benefit of having multiple repos per project vs. having multiple projects (where project = repo) per group? Especially in a world with nested groups? Just trying to understand the meaningful differences. Like I think you still need issues, boards, etc. at both levels. Not sure what multiple repos per project gets you that you don't get with nested groups. Thanks!
Multiple repos allow the project and its issues to be at the abstraction level of features (and not code). And so any one issue can then be translated into and related across multiple repos in that one project. In this scenario, you can build one issue board of issues from just one project. The primary benefit I see is that an issue is truly project/team-centered (when you first conceive of it), and then later on, you figure out how it should be created across the multiple repos. But you've already read that several times above.
I recognize with group labels there is already inertia to head towards using a group to house many projects. And with nested groups, that provides more expressive power, finer grained permissions, etc. So we should work with and build on what we have. We would need to have a super significant reason to take another approach and force ourselves to backtrack in any way, technically, design-concept-wise, etc., because it would be a huge cost right now to do so. Realistically I see this as the path forward. But I wanted to at least entertain the idea of multiple repos that Nick brought up earlier to make sure we weren't missing anything.
Back to more of the focus of this issue about team-centered collaboration, I think it will be instructive as we figure out the issue board use cases in this team-collaboration context. Maybe we don't have to worry about this just now, but take this as an example of why I think this is a problem we have to solve very carefully: Suppose you have a group issue board already with 5 projects. And then you go ahead and click a button to create a new issue from the issue board UI. What if the user doesn't know which of the 5 projects the issue should belong to? Should we force the user to make a decision? Should they care? Should we provide a default project? Defer the decision to later? Use a hidden shadow repo-less project in the background, thus introducing the concept of "group issues"? I can see this getting messy quick.
Agreed that we have to be thoughtful with this @victorwu!
My mentality is roughly this: have a vague sense of the long-term goal/direction, 'get the simple thing out there', then continue thinking through the more complicated situations, while seeing how the product is used and what feedback we get.
This means, sometimes, I don't mind putting a little more burden on users at first in order to deliver the simple thing. For example, on the scenario you mentioned above, I would start with forcing the user to decide what project they want to add the issue to. I know that sometimes, a user won't want to do that. But fortunately issues can be moved between projects. It is also interesting to see how people figure things out based on the tools you give them. Maybe they will create their own 6th project to dump these unknown issues into first? It would be interesting to see if that solution, or another one, pops up in the wild.
I worry that it is easy to overthink and overdesign for the complicated situations. The variety of teams, company and org structures is staggering. We can't build something that works for all situations from the get-go. But I think we can build something that is reasonable for a lot of situations.
In my opinion multiple repos per project are bad:
- Development: it's a complex architecture change
- Education: we need to explain the difference between projects and repositories
- UX: it introduces an extra layer for users to deal with. Group -> Project -> Repository vs Group -> Project.
I think we should not introduce new hierarchy element under Project ever. Project is a perfect thing for new users to start with. Signup, create a project and start using it. Want something advanced and complex? Use groups. Especially with nested groups that will come in 9.0.
My opinion on Team-centered collaboration:
- I strongly believe both project-centered and team-centered collaboration were, are and will always be.
- Our application was developed mostly for project-centered collaboration. For 5 years. We must not break it
- Agree we should make team-centered collaboration less painful. Let's use group-level features for it.
I think that many software projects consist of multiple repositories. That doesn't mean that what we call Projects should have multiple repositories. In fact, we basically will have this functionality with nested groups.
I kept it purposely vague in the strategy doc as well: https://about.gitlab.com/direction/strategy/#multi-repository-projects
@dzaporozhets I think we agree that we want to be better at supporting more complex software projects, it probably makes sense to do this at the group level!
Group-level things are also much more flexible :)
I think that many software projects consist of multiple repositories.
Agree. But everyone starts from one. And we want to keep it simple (like it is now).
That doesn't mean that what we call Projects should have multiple repositories.
@JobV that's what I want to prevent. I want us to keep what we call Projects as the lowest hierarchy element, and allow it to be created directly inside a user/group without creating something in between.
I think we agree that we want to be better at supporting more complex software projects, it probably makes sense to do this at the group level!
Thank you @dzaporozhets for the insight. I definitely don't see us going for multiple repos per project anytime soon, especially given all this discussion. Given there is so much momentum with groups and nested groups, that's a natural area to leverage for team-centered collaboration, and so the inertia is to use that. I really want to see how issue boards will evolve and how that impacts our vision of team vs project centered. I'll leave this issue open as a reminder for us that we are still lacking in many team-centered use cases. And as issue boards evolve and when 9.0 is released, we can re-evaluate how much more work we need to do.
Thank you especially for bringing up a unique point I haven't considered yet (or others forgive me if I didn't see it earlier in something you pointed out):
Project is a perfect thing for new users to start with.
This is extremely important as GitLab grows in functionality and complexity. This is indeed a powerful abstraction for new users. Just the right balance of power and simplicity. And it's crucial for us to maintain that if we want to continue attracting new users and customers.
Being able to group projects into a sub-group is definitely great, but I wonder what the next step is? Assuming sub-groups are possessive of their projects, meaning one project can only belong to one sub-group, then you've got a rigid hierarchy, which works fine for a lot of cases. But when we talk about teams and repos, one thing to recognize is that these are often not exclusive relationships. For example, the CI team wants their own project management, issue boards, etc., but writes code in the same repos as the rest of GitLab. Even if we were using microservices, that would likely still be true. So team to project is not a hierarchy.
Even when thinking about a "product", it's not always possible to draw unique boundaries between sets of services to declare them in a hierarchy. There's overlapping dependencies, even circular dependencies sometimes.
Is there some world where teams are actually independent things that can be attached to multiple projects in a many-to-many relationship?
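As a rough illustration of the question above, the sketch below models teams as independent things attached to projects in a many-to-many relationship, rather than as a level in a hierarchy. `TeamProjectLinks` is a hypothetical name and not a GitLab concept. The "ci" team and the project names come from this thread, while the "discussion" team is made up for the example.

```python
from collections import defaultdict
from typing import Set


class TeamProjectLinks:
    """Many-to-many association between teams and projects (hypothetical)."""

    def __init__(self) -> None:
        # Index both directions so either side can be looked up directly.
        self._projects_by_team = defaultdict(set)
        self._teams_by_project = defaultdict(set)

    def attach(self, team: str, project: str) -> None:
        self._projects_by_team[team].add(project)
        self._teams_by_project[project].add(team)

    def projects_for(self, team: str) -> Set[str]:
        return set(self._projects_by_team.get(team, ()))

    def teams_for(self, project: str) -> Set[str]:
        return set(self._teams_by_project.get(project, ()))


links = TeamProjectLinks()
links.attach("ci", "gitlab-ce")
links.attach("ci", "gitlab-runner")
links.attach("discussion", "gitlab-ce")  # a second team sharing the same repo
```

Here a project can be attached to several teams and a team to several projects, which a strict sub-group hierarchy cannot express.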
Sorry to intervene on this issue but we use GitLab with what I think is a typical use case: we have 1 back-end project that is used for our public server. We also have 3 different repositories for our 3 mobile applications (iOS, Android, Windows) that communicate with our back-end.
For now, we have difficulties planning a new mobile feature since we don't know yet on what platform the feature will be started. To solve that, we created a new "Mobile" project that manages all "high level" issues for mobile development. The purpose of this project is to manage meta-issues that will later create issues in other projects.
Following the whole discussion and specifically @markpundsack's last comment, I was wondering about adding some cross-project tools? I am not totally clear on what could be possible but I think it can be an alternative to using a rigid hierarchy.
For example, we could add the ability to extend the task concept to easily create/track remote issues. I was thinking about something like `- [ ] project#123` that could render as a checkbox followed by a link to the project and a link to the issue with its title and labels. The checkbox would be read-only and based on the state (open/closed) of the remote issue. We could also add the syntax `- [ ] project# Title of the issue` (without issue ID) that would render as a create button to easily create this issue in the remote project and append the issue ID once the issue is created.
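A minimal sketch of how the two proposed task forms could be told apart when parsing a description. This is only an illustration of the idea in this comment, not existing GitLab behavior; the regular expression, `RemoteTask`, and `parse_remote_task` are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Two proposed forms (both hypothetical syntax from the comment above):
#   "- [ ] project#123"          reference to an existing remote issue
#   "- [ ] project# Some title"  request to create a new remote issue
TASK_RE = re.compile(
    r"^- \[[ xX]\] (?P<project>[\w./-]+)#(?:(?P<iid>\d+)|\s+(?P<title>\S.*))\s*$"
)


@dataclass
class RemoteTask:
    project: str
    iid: Optional[int] = None    # set when the task points at an existing issue
    title: Optional[str] = None  # set when the issue still has to be created

    @property
    def needs_creation(self) -> bool:
        return self.iid is None


def parse_remote_task(line: str) -> Optional[RemoteTask]:
    """Return a RemoteTask for a matching task line, or None for ordinary tasks."""
    m = TASK_RE.match(line)
    if m is None:
        return None
    iid = m.group("iid")
    return RemoteTask(
        project=m.group("project"),
        iid=int(iid) if iid else None,
        title=m.group("title"),
    )
```

A renderer could then show a read-only checkbox for tasks with an `iid` (driven by the remote issue's open/closed state) and a create button for tasks where `needs_creation` is true.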
It does not solve all the problems stated in this discussion but I think it could help project collaboration as a whole, and could also be used for other purposes like tracking issues across multiple public (open source?) projects.
On the subject of team issues vs multiple project repos: Even if you did go down the route of multiple repos for a project, I'd still imagine that I'd want the ability to see issues at a group level. Whereas the opposite isn't necessarily true.
I also agree that it should be developed with a mind to the future, but shouldn't be over-engineered into a corner. Release Early and Release Often. E.g. don't worry about forcing issue creators to choose a particular repo. Add that in later if required. Maybe some teams would want to force people to put their issues straight into a repo.