Problems with GitLab's Knowledge Architecture

Problem

Adhering to our knowledge architecture as a Product principle has led to unintended consequences, resulting in numerous customer pain points that ultimately limit customers' ability to fully adopt and realize the value of GitLab as the DevSecOps platform.

Context

Our Organization > Group > Project knowledge architecture can be classified as a vertical hierarchy.

Pros of a vertical hierarchy:

  • The onboarding is easy.

Cons of a vertical hierarchy:

  • Every knowledge object (code, MR, issue, epic, ...) has to be in exactly one place.
  • The organization has to agree on a single hierarchy for all purposes.
  • Evolution is near impossible.
  • Horizontal connections are lost and/or extremely painful to create and maintain.
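
To make the "exactly one place" constraint concrete, here is a minimal sketch of a strict single-parent tree. The `Node` class, its `attach` and `path` methods, and the lowercase names are hypothetical and only illustrate the shape of the problem; they are not GitLab's data model. Because a path is derived from the chain of ancestors, moving a project to another group changes the path of everything under it.

```python
from __future__ import annotations


class Node:
    """A group, project, or knowledge object in a strict tree (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parent: Node | None = None
        self.children: list[Node] = []

    def attach(self, child: Node) -> Node:
        # Single-parent rule: placing a child here removes it from its old parent.
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)
        return child

    def path(self) -> str:
        # The path is derived from the chain of ancestors, so a move changes it.
        parts = []
        node: Node | None = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


org = Node("org")
group_a = org.attach(Node("group-a"))
group_b = org.attach(Node("group-b"))
project = group_a.attach(Node("project-1"))
issue = project.attach(Node("issue-y"))

path_before = issue.path()  # "org/group-a/project-1/issue-y"
group_b.attach(project)     # reorganizing moves the whole subtree
path_after = issue.path()   # "org/group-b/project-1/issue-y"
```

In this model there is no way for `issue-y` to appear under both groups at once, which is the horizontal connection that gets lost.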

Hierarchical structures are usually forced and artificial. Interwingularity is not generally acknowledged — people think they can make things hierarchical, categorizable and sequential when they can’t.

TL;DR: In reality, organizations are multi-dimensional. By requiring customers to decide on a single structure for knowledge, we force them to optimize for a single dimension, which negatively impacts every other dimension for collaborating on knowledge. At the same time, customers value and prioritize least-privileged access.

Real-world customer problems

Organic Group/Project Structure (Most frequently encountered use case)

flowchart TD
    Organization --> Group_A
    Organization --> Group_B
    Group_A --> Project_1
    Group_B --> Project_2
    Group_A --> Milestone
    Project_2 --> Issue_X
    Issue_X --> MR
    Project_1 --> Issue_Y
    Group_A --> Board
    Project_1 --> Issue_Z
    Project_2 --> Issue_W
    Organization --> Project_3
    Project_3 --> Issue_V

   classDef team fill:#7FFFD4
   class Group_A,Milestone,Board,Issue_Y,Issue_X,MR team

The green nodes represent knowledge entities that belong to a single team. In this scenario, the team cannot:

  • View, track, and manage their team's issues within a single location (ex: Board). real-world customer example
  • Associate all of their team's issues with their milestone.
  • View a value stream analytics report including all their knowledge entities.
  • Control any settings or configuration specific to that team without impacting all other teams working out of the same groups and projects (ex: team-specific templates, statuses, or custom fields).
  • Apply consistent meta-data to all of the team's issues (ex: a team::foo label would need to be duplicated in both groups...which have different label ids under the hood).
  • Re-organize groups and projects to work around these limits. For larger customers with thousands of groups and projects, this is a non-starter because of the consequences of repository paths changing at scale.
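
The first two constraints can be checked against the flowchart itself. The sketch below copies the flowchart's parent-to-child edges and assumes, as a simplification, that a container can only aggregate objects in its own subtree. The helper names (`ancestors`, `issues_under`, `smallest_common_container`) are illustrative, not GitLab APIs.

```python
# Parent -> child edges copied from the flowchart above.
EDGES = [
    ("Organization", "Group_A"), ("Organization", "Group_B"),
    ("Organization", "Project_3"),
    ("Group_A", "Project_1"), ("Group_A", "Milestone"), ("Group_A", "Board"),
    ("Group_B", "Project_2"),
    ("Project_1", "Issue_Y"), ("Project_1", "Issue_Z"),
    ("Project_2", "Issue_X"), ("Project_2", "Issue_W"),
    ("Project_3", "Issue_V"), ("Issue_X", "MR"),
]
PARENT = {child: parent for parent, child in EDGES}
TEAM_ISSUES = {"Issue_X", "Issue_Y"}


def ancestors(node):
    """Return the node followed by each of its ancestors up to the root."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def issues_under(container):
    """Issues a container can aggregate: those in its own subtree (assumed rule)."""
    return {n for n in PARENT if n.startswith("Issue_") and container in ancestors(n)}


def smallest_common_container(nodes):
    """Lowest node whose subtree holds every given node."""
    chains = [ancestors(n) for n in nodes]
    shared = set(chains[0]).intersection(*chains[1:])
    return next(n for n in chains[0] if n in shared)


board_scope = issues_under(PARENT["Board"])  # the Board lives in Group_A
container = smallest_common_container(sorted(TEAM_ISSUES))
container_scope = issues_under(container)
```

Under these assumptions the Board in `Group_A` sees `Issue_Y` and the unrelated `Issue_Z` but misses `Issue_X`. The only container that covers both team issues is `Organization`, which also pulls in `Issue_Z`, `Issue_W`, and `Issue_V`.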

Abstracting permissions into subgroups

flowchart TD
    Organization --> Group_Permissions
    Organization --> Group_A
    Organization --> Group_B
    Group_A --> Subgroup_AA
    Group_B --> Subgroup_BB
    Subgroup_AA --> Project_1
    Subgroup_BB --> Project_2
    Project_1 --> Issue_X --> MR
    Subgroup_BB --> Milestone
    Subgroup_BB --> Board
    Project_2 --> Issue_Y
    Subgroup_AA --> Project_3
    Group_Permissions --> Team_Subgroup
    Team_Subgroup .->|Group Share|Project_1
    Team_Subgroup .->|Group Share|Subgroup_BB
    Project_1 --> Issue_Z

classDef team fill:#7FFFD4
class Team_Subgroup,Issue_X,MR,Subgroup_BB,Project_2,Issue_Y,Board,Milestone team

The same constraints exist even when you partially abstract permissions into a separate group/subgroup structure. real-world customer example

Separate planning and repository hierarchy

flowchart TD
    Organization --> Group_Planning
    Organization --> Group_Repos
    Group_Planning --> Subgroup_TeamA
    Group_Planning --> Subgroup_TeamB
    Subgroup_TeamA --> Project_TeamA
    Project_TeamA --> Issue_X
    Group_Repos --> Subgroup_Repos --> Project_MicroserviceA --> MR2
    Subgroup_Repos --> Project_Microservice_B --> MR1
    Subgroup_TeamA --> Milestone
    Subgroup_TeamA --> Board

classDef team fill:#7FFFD4
class Subgroup_TeamA,Issue_X,MR1,MR2,Project_TeamA,Issue_Y,Board,Milestone team

In this scenario, the team's planning objects are in a single hierarchy, but the code changes are in another hierarchy. This results in:

  • No way to associate the team with the MRs. real-world customer example
  • MRs are not visible in the same location as the team's planning objects.
  • Value stream analytics within the planning subgroup does not include any metrics (cycle time, deploys, ...) for code changes the team makes.

Future State: Organizations will not provide an adequate solution to the core problems with our knowledge architecture

flowchart TD
    Organization --> Group_A
    Organization --> Group_B
    Organization --> Milestone
    Organization --> Board
    Group_A --> Project_1
    Group_B --> Project_2
    Project_2 --> Issue_X
    Issue_X --> MR
    Project_1 --> Issue_Y
    Project_1 --> Issue_Z
    Project_2 --> Issue_W
    Organization --> Project_3
    Project_3 --> Issue_V
    User .->|Reporter|Organization
    Organization --> Label_team
    Organization --> VSA_report

   classDef team fill:#7FFFD4
   class Milestone,Board,Issue_Y,Issue_X,MR,Label_team team

We frequently tell customers that the solution to this problem is to create your Boards, Milestones, Iterations, Labels, etc. within the highest-level root group. Today, that still does not solve the problem of horizontal workflows across root subgroups. In a future state, we may propose something like making features available within the Organization.

This is not a scalable solution to the problem because:

  • It does not address the problem of horizontal workflows among all subgroups/projects.
  • It requires customers to grant access to the Organization, which is a non-starter for a substantial number of customers due to the desire for least-privileged access.
  • There is still no way to compare, measure, or track how teams are doing that encompasses all of the parts of the SDLC they contribute to. example real customer problem
  • It would require duplicating features between Namespace and Organization, which is the exact problem we've been working to untangle for years.
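
The least-privileged access conflict can be sketched the same way. The model below assumes a simplified inheritance rule in which a grant on a container applies to everything beneath it; the `can_read` and `readable` helpers and the reduced node set are illustrative only and do not reflect the full permissions model.

```python
# Parent links for part of the future-state flowchart above.
PARENT = {
    "Board": "Organization", "Milestone": "Organization",
    "Group_A": "Organization", "Group_B": "Organization",
    "Project_3": "Organization",
    "Project_1": "Group_A", "Project_2": "Group_B",
    "Issue_Y": "Project_1", "Issue_Z": "Project_1",
    "Issue_X": "Project_2", "Issue_W": "Project_2",
    "Issue_V": "Project_3",
}
ALL_NODES = set(PARENT) | {"Organization"}
TEAM_ISSUES = {"Issue_X", "Issue_Y"}


def can_read(grants, node):
    """Assumed rule: a grant on a container applies to everything beneath it."""
    while node is not None:
        if node in grants:
            return True
        node = PARENT.get(node)
    return False


def readable(grants):
    """Every node a user with the given grants can read."""
    return {n for n in ALL_NODES if can_read(grants, n)}


# Option 1: grant only the two projects the team works in.
narrow_sees_board = can_read({"Project_1", "Project_2"}, "Board")

# Option 2: grant Reporter on the Organization so the Board is reachable.
org_grant = readable({"Organization"})
excess_issues = {n for n in org_grant if n.startswith("Issue_")} - TEAM_ISSUES
```

With the narrow grant the team cannot reach the Organization-level Board at all. With the Organization grant the team can read every node, including three issues it has no reason to see.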

Root causes

Groups and Projects are responsible for:

  • Access - to gate the permission of knowledge objects/entities
  • Settings - to control how the system behaves
  • Aggregation - to collect various information together
  • Combination - to allow multiple things to be treated as one when applying access, settings, and when aggregating.
  • Features - to apply features to a given data set

** Read the full discussion

There have been numerous discussions on this topic, but based on what we know today:

  • Abstracting access out of Groups and Projects will not solve the fundamental constraints of a vertical hierarchy knowledge architecture.
  • Maintaining settings as they behave today will not allow teams to selectively apply settings to only a subset of objects within a group or project.
  • Aggregation and combination spanning root groups (or selectively two of five subgroups within a root group) is impossible. Moving up to the organization level is not an option with our current permissions model, as it does not respect least-privileged access.
  • Reaching Feature parity between Groups and Projects does not solve the core problems customers are experiencing with

TL;DR: A rigid vertical hierarchy + inheritance model will be a significant hurdle to overcome to realize our vision of being an AllOps Platform. Today, it's responsible for an outsized negative impact on our customers, the product, and ultimately our business.

Business Impact

This problem is a significant contributor to:

  • Plan adoption: Customers cannot adopt Plan, thus limiting growth in EAP revenue. (1)
  • Performance & Availability: Traversing a large group/project hierarchy is resource-intensive and often results in time-outs.
  • Usability problems: This has been noted time and again in multiple UX research projects (1)

How competitors are solving this problem

GitHub

flowchart TD
   Repo_Project --> Issue --> MR
   Issue .->|Linked To|TeamA_Project
   Issue .->|Linked To|TeamB_Project
   TeamA_Project --> Label1
   TeamB_Project --> Label2
   Label1 .-> Issue 
   Label2 .-> Issue

Repository objects (ex: Issue) can be linked to Team projects. Each team project can associate its own meta-data to the issue. All meta-data is visible within the Repository issue, while each team's meta-data is only visible within each team's project.
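
As a rough sketch of the linking model described above (not GitHub's actual API or schema), an issue can keep a separate metadata bucket per linked team project. The `Issue` class and its method names are hypothetical, as is the issue title.

```python
from collections import defaultdict


class Issue:
    """A repository issue that any number of team projects can link to."""

    def __init__(self, title):
        self.title = title
        # project name -> metadata that project attached to this issue
        self.metadata_by_project = defaultdict(dict)

    def link(self, project, **metadata):
        # Linking does not move the issue; it only records the project's view.
        self.metadata_by_project[project].update(metadata)

    def repository_view(self):
        """All metadata from every linked project, as seen on the issue itself."""
        return {p: dict(m) for p, m in self.metadata_by_project.items()}

    def project_view(self, project):
        """Only the metadata owned by one team project."""
        return dict(self.metadata_by_project.get(project, {}))


issue = Issue("Fix login timeout")
issue.link("TeamA_Project", label="Label1")
issue.link("TeamB_Project", label="Label2")
```

The key design difference from a vertical hierarchy is that the issue stays in its repository while each team attaches its own view, so no team has to agree on a shared label or location.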

Asana

flowchart TD
   Organization --> Global_Fields --> FieldA
   Task .->|Linked To|TeamA_Project
   Task .->|Linked To|TeamB_Project
   TeamA_Project --> FieldB
   TeamB_Project --> FieldC
   FieldA --> Task
   FieldB --> Task 
   FieldC --> Task

Tasks are global entities that can be linked to many projects. An organization has a global field library that is shared with all projects. Each project can have its own field library. Global fields + project-owned fields are displayed within a project's aggregation views (ex: board, list). All fields associated with a task are viewable within the task detail but are only accessible/viewable if you have access to the origin project that owns the field.
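
The field visibility rule described above can be sketched as a filter over a task's field values. This is an illustration under stated assumptions, not Asana's API: the `Field` and `Task` types, the `visible_values` helper, and the sample values are all hypothetical, and a global field is modeled as one with no owning project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Field:
    name: str
    owner: Optional[str]  # owning project, or None for the global field library


@dataclass
class Task:
    title: str
    projects: set  # projects the task is linked to
    values: dict   # Field -> value


def visible_values(task, accessible_projects):
    """Global fields plus fields whose origin project the user can access."""
    return {
        f.name: v
        for f, v in task.values.items()
        if f.owner is None or f.owner in accessible_projects
    }


field_a = Field("FieldA", owner=None)
field_b = Field("FieldB", owner="TeamA_Project")
field_c = Field("FieldC", owner="TeamB_Project")
task = Task(
    title="Ship onboarding flow",
    projects={"TeamA_Project", "TeamB_Project"},
    values={field_a: "P1", field_b: "In review", field_c: "Sprint 12"},
)

team_a_view = visible_values(task, {"TeamA_Project"})
both_view = visible_values(task, {"TeamA_Project", "TeamB_Project"})
```

A user with access only to `TeamA_Project` sees `FieldA` and `FieldB`; `FieldC` stays hidden until they also have access to `TeamB_Project`, which keeps access least-privileged without forcing the task into a single project.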

Related discussions and proposals

Documented examples of where this is causing problems for customers
