Problems with GitLab's Knowledge Architecture
Problem
Strict adherence to our knowledge architecture as a product principle has had unintended consequences, resulting in numerous customer pain points that ultimately limit customers' ability to fully adopt and realize the value of GitLab as the DevSecOps platform.
Context
Our Organization > Group > Project knowledge architecture can be classified as a vertical hierarchy.
Pros of a vertical hierarchy:
- The onboarding is easy.
Cons of a vertical hierarchy:
- Every knowledge object (code, MR, issue, epic, ...) has to be in exactly one place.
- The organization has to agree on a single hierarchy for all purposes.
- Evolution is near impossible.
- Horizontal connections are lost and/or extremely painful to create and maintain.
Hierarchical structures are usually forced and artificial. Interwingularity is not generally acknowledged — people think they can make things hierarchical, categorizable and sequential when they can’t.
TL;DR: In reality, organizations are multi-dimensional. By requiring customers to decide on a single structure for knowledge, we force them to optimize for a single dimension, which negatively impacts every other dimension for collaborating on knowledge. At the same time, customers value and prioritize least-privileged access.
Real-world customer problems
Organic Group/Project Structure (Most frequently encountered use case)
flowchart TD
Organization --> Group_A
Organization --> Group_B
Group_A --> Project_1
Group_B --> Project_2
Group_A --> Milestone
Project_2 --> Issue_X
Issue_X --> MR
Project_1 --> Issue_Y
Group_A --> Board
Project_1 --> Issue_Z
Project_2 --> Issue_W
Organization --> Project_3
Project_3 --> Issue_V
classDef team fill:#7FFFD4
class Group_A,Milestone,Board,Issue_Y,Issue_X,MR team
The green nodes represent knowledge entities that belong to a single team. In this scenario, the team cannot:
- View, track, and manage their team's issues within a single location (ex: Board). real-world customer example
- Associate all of their team's issues with their milestone.
- View a value stream analytics report including all their knowledge entities.
- Control any settings or configuration specific to that team without impacting all other teams working out of the same groups and projects (ex: team-specific templates, statuses, or custom fields).
- Apply consistent meta-data to all of the team's issues (ex: a team::foo label would need to be duplicated in both groups, which have different label ids under the hood).
- For larger customers with thousands of groups and projects, re-organizing groups and projects is a non-starter because of the consequences of repository paths changing at scale.
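The constraint in this scenario can be sketched in a few lines of Python. This is a simplified, hypothetical model of the diagram above, not GitLab's actual data model: the `PARENT` mapping, the helper names, and the choice of `Issue_X` and `Issue_Y` as the team's issues are assumptions taken from the diagram for illustration only.

```python
# Simplified, hypothetical model: each object has exactly one parent.
PARENT = {
    "Group_A": "Organization",
    "Group_B": "Organization",
    "Project_3": "Organization",
    "Project_1": "Group_A",
    "Project_2": "Group_B",
    "Issue_Y": "Project_1",
    "Issue_Z": "Project_1",
    "Issue_X": "Project_2",
    "Issue_W": "Project_2",
    "Issue_V": "Project_3",
}

# The team's issues (green nodes in the diagram).
TEAM_ISSUES = {"Issue_X", "Issue_Y"}


def ancestors(node):
    """Return the containers above `node`, nearest first."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def issues_in(scope):
    """Issues that a board or report scoped to `scope` can aggregate."""
    return sorted(
        n for n in PARENT if n.startswith("Issue_") and scope in ancestors(n)
    )


def lowest_common_scope(nodes):
    """Return the deepest container that holds every node, or None."""
    first, *rest = sorted(nodes)
    for candidate in ancestors(first):
        if all(candidate in ancestors(other) for other in rest):
            return candidate
    return None


board_issues = issues_in("Group_A")               # what the team's board can show
missing = sorted(TEAM_ISSUES - set(board_issues))  # team issues the board cannot see
scope = lowest_common_scope(TEAM_ISSUES)           # smallest scope covering the team
unrelated = sorted(set(issues_in(scope)) - TEAM_ISSUES)

print(board_issues, missing, scope, unrelated)
```

In this sketch the board in `Group_A` misses `Issue_X` while including the unrelated `Issue_Z`, and the only scope that covers both team issues is the Organization, which also pulls in every other team's issues.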
Abstracting permissions into subgroups
flowchart TD
Organization --> Group_Permissions
Organization --> Group_A
Organization --> Group_B
Group_A --> Subgroup_AA
Group_B --> Subgroup_BB
Subgroup_AA --> Project_1
Subgroup_BB --> Project_2
Project_1 --> Issue_X --> MR
Subgroup_BB --> Milestone
Subgroup_BB --> Board
Project_2 --> Issue_Y
Subgroup_AA --> Project_3
Group_Permissions --> Team_Subgroup
Team_Subgroup .->|Group Share|Project_1
Team_Subgroup .->|Group Share|Subgroup_BB
Project_1 --> Issue_Z
classDef team fill:#7FFFD4
class Team_Subgroup,Issue_X,MR,Subgroup_BB,Project_2,Issue_Y,Board,Milestone team
The same constraints exist even when you partially abstract permissions into a separate group/subgroup structure. real-world customer example
Separate planning and repository hierarchy
flowchart TD
Organization --> Group_Planning
Organization --> Group_Repos
Group_Planning --> Subgroup_TeamA
Group_Planning --> Subgroup_TeamB
Subgroup_TeamA --> Project_TeamA
Project_TeamA --> Issue_X
Group_Repos --> Subgroup_Repos --> Project_MicroserviceA --> MR2
Subgroup_Repos --> Project_Microservice_B --> MR1
Subgroup_TeamA --> Milestone
Subgroup_TeamA --> Board
classDef team fill:#7FFFD4
class Subgroup_TeamA,Issue_X,MR1,MR2,Project_TeamA,Issue_Y,Board,Milestone team
In this scenario, the team's planning objects are in a single hierarchy, but the code changes are in another hierarchy. This results in:
- No way to associate the team with the MRs. real-world customer example
- MRs are not visible in the same location as the team's planning objects.
- Value stream analytics within the planning subgroup does not include any metrics (cycle time, deploys, ...) for code changes the team makes.
Future State: Organizations will not provide an adequate solution to the core problems with our knowledge architecture
flowchart TD
Organization --> Group_A
Organization --> Group_B
Organization --> Milestone
Organization --> Board
Group_A --> Project_1
Group_B --> Project_2
Project_2 --> Issue_X
Issue_X --> MR
Project_1 --> Issue_Y
Project_1 --> Issue_Z
Project_2 --> Issue_W
Organization --> Project_3
Project_3 --> Issue_V
User .->|Reporter|Organization
Organization --> Label_team
Organization --> VSA_report
classDef team fill:#7FFFD4
class Milestone,Board,Issue_Y,Issue_X,MR,Label_team team
We frequently tell customers that the solution to this problem is to create your Boards, Milestones, Iterations, Labels, etc. within the highest-level root group. Today, that still does not solve the problem of horizontal workflows across root subgroups. In a future state, we may propose something like making features available within the Organization.
This is not a scalable solution to the problem because:
- It does not address the problem of horizontal workflows among all subgroups/projects.
- It requires customers to grant access to the Organization, which is a non-starter for a substantial number of customers due to the desire for least-privileged access.
- There is still no way to compare, measure, or track how teams are doing that encompasses all of the parts of the SDLC they contribute to. example real customer problem
- It would require duplicating features between Namespace and Organization, which is the exact problem we've been working to untangle for years.
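The least-privileged access concern can be illustrated with a minimal sketch. It assumes a simplified model in which a grant on a container is inherited by everything beneath it; the `PARENT` mapping, the `can_read` helper, and the `dev` user are hypothetical names for illustration and do not represent GitLab's actual permission implementation.

```python
# Hypothetical single-parent hierarchy mirroring the diagram above.
PARENT = {
    "Group_A": "Organization",
    "Group_B": "Organization",
    "Project_3": "Organization",
    "Project_1": "Group_A",
    "Project_2": "Group_B",
}

PROJECTS = ["Project_1", "Project_2", "Project_3"]


def can_read(grants, user, container):
    """Assumed rule: a grant on a container applies to everything beneath it."""
    node = container
    while node is not None:
        if (user, node) in grants:
            return True
        node = PARENT.get(node)
    return False


# Least privilege: the user is granted only the projects the team works in.
least_privilege = {("dev", "Project_1"), ("dev", "Project_2")}

# To use a Board or Milestone owned by the Organization, the user needs a
# grant on the Organization itself.
org_level = {("dev", "Organization")}

sees_org_board_before = can_read(least_privilege, "dev", "Organization")
readable_before = [p for p in PROJECTS if can_read(least_privilege, "dev", p)]
readable_after = [p for p in PROJECTS if can_read(org_level, "dev", p)]

print(sees_org_board_before, readable_before, readable_after)
```

Under these assumptions, the user cannot reach Organization-level features with project-level grants, and the grant that fixes this also exposes `Project_3`, which the team has no reason to access.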
Root causes
Groups and Projects are responsible for:
- Access - to gate the permission of knowledge objects/entities
- Settings - to control how the system behaves
- Aggregation - to collect various information together
- Combination - to allow multiple things to be treated as one when applying access, settings, and when aggregating.
- Features - to apply features to a given data set
There have been numerous discussions on this topic, but based on what we know today:
- Abstracting access out of Groups and Projects will not solve the fundamental constraints of a vertical hierarchy knowledge architecture.
- Maintaining settings as they behave today will not allow teams to selectively apply settings to only a subset of objects within a group or project.
- Aggregation and combination spanning root groups (or selectively two of five subgroups within a root group) is impossible. Moving up to the organization level is not an option with our current permissions model, as it does not respect least-privileged access.
- Reaching Feature parity between Groups and Projects does not solve the core problems customers are experiencing with
TL;DR: A rigid vertical hierarchy + inheritance model will be a significant hurdle to overcome to realize our vision of being an AllOps Platform. Today, it's responsible for an outsized negative impact on our customers, the product, and ultimately our business.
Business Impact
This problem is a significant contributor to:
- Plan adoption: Customers cannot adopt Plan, thus limiting growth in EAP revenue. (1)
- Performance & Availability: Traversing a large group/project hierarchy is resource-intensive and often results in time-outs.
- Usability problems: This has been noted time and again in multiple UX research projects (1)
How competitors are solving this problem
GitHub
flowchart TD
Repo_Project --> Issue --> MR
Issue .->|Linked To|TeamA_Project
Issue .->|Linked To|TeamB_Project
TeamA_Project --> Label1
TeamB_Project --> Label2
Label1 .-> Issue
Label2 .-> Issue
Repository objects (ex: Issue) can be linked to Team projects. Each team project can associate its own meta-data to the issue. All meta-data is visible within the Repository issue, while each team's meta-data is only visible within each team's project.
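The linking model described above can be sketched as follows. This is an illustrative Python model of the diagram, not GitHub's API: the `Issue` and `TeamProject` classes, their methods, and the issue title are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    id: int
    title: str
    # Team projects this issue has been linked to.
    linked_projects: list = field(default_factory=list)

    def all_labels(self):
        """The issue view shows meta-data from every linked team project."""
        return {p.name: p.labels[self.id] for p in self.linked_projects}


@dataclass
class TeamProject:
    name: str
    # Meta-data owned by this project, keyed by issue id.
    labels: dict = field(default_factory=dict)

    def link(self, issue, label):
        """Link an issue to this project and attach project-owned meta-data."""
        self.labels[issue.id] = label
        issue.linked_projects.append(self)


issue = Issue(1, "Example issue")
team_a = TeamProject("TeamA_Project")
team_b = TeamProject("TeamB_Project")
team_a.link(issue, "Label1")
team_b.link(issue, "Label2")

print(issue.all_labels())  # meta-data from both teams, seen on the issue
print(team_a.labels)       # only Team A's own meta-data, seen in its project
```

The design choice worth noting is that the issue keeps a single home in the repository while membership in team projects is a many-to-many link, so neither team has to move or duplicate the issue.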
Asana
flowchart TD
Organization --> Global_Fields --> FieldA
Task .->|Linked To|TeamA_Project
Task .->|Linked To|TeamB_Project
TeamA_Project --> FieldB
TeamB_Project --> FieldC
FieldA --> Task
FieldB --> Task
FieldC --> Task
Tasks are global entities that can be linked to many projects. An organization has a global field library that is shared with all projects. Each project can have its own field library. Global fields + project-owned fields are displayed within a project's aggregation views (ex: board, list). All fields associated with a task are viewable within the task detail but are only accessible/viewable if you have access to the origin project that owns the field.
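A minimal sketch of the field visibility rule described above, assuming each field records the project that owns it. The `Field` class, the `"global"` owner marker, and both helper functions are hypothetical names for illustration and do not reflect Asana's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    owner: str  # "global" or the name of the project that owns the field


# Fields attached to one task, following the diagram above.
TASK_FIELDS = [
    Field("FieldA", "global"),
    Field("FieldB", "TeamA_Project"),
    Field("FieldC", "TeamB_Project"),
]


def project_view_fields(fields, project):
    """A project's board or list shows global fields plus its own fields."""
    return [f.name for f in fields if f.owner in ("global", project)]


def task_detail_fields(fields, accessible_projects):
    """Task detail shows a field only if the viewer can access its owner."""
    return [
        f.name
        for f in fields
        if f.owner == "global" or f.owner in accessible_projects
    ]


print(project_view_fields(TASK_FIELDS, "TeamA_Project"))
print(task_detail_fields(TASK_FIELDS, {"TeamA_Project"}))
print(task_detail_fields(TASK_FIELDS, {"TeamA_Project", "TeamB_Project"}))
```

In this sketch, access is evaluated per field owner rather than per position in a hierarchy, which is what lets one task appear in several projects without widening anyone's access.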
Related discussions and proposals
- For issues, see "linked items" below.
- Manage users via Teams in GitLab and transition... (&122)