Team Ownership Restructuring Proposal: From Silos to Sustainable Knowledge Management
TLDR:
- Start using CODEOWNERS files to make clear who is responsible for which parts of our code
- Turn off DangerBot, since CODEOWNERS will handle reviewer assignment instead
- Create two ownership roles:
  - Primary Owners: The go-to experts who'll maintain their components and mentor others
  - Secondary Owners: The backup experts who'll help with their areas and serve as a backup go-to person
- Keep a healthy balance:
  - Project DRIs will still champion new features
  - Component owners will look after code quality and architecture
- Make knowledge sharing a priority:
  - Primaries will mentor secondaries and keep the documentation up-to-date
  - High review workloads will naturally encourage spreading knowledge around or distributing areas
Current Challenges
Our container registry project faces several critical ownership and knowledge management issues that threaten long-term sustainability:
Technical Knowledge Concentration
The registry codebase continues to grow in size and complexity. Making reliable changes increasingly requires deep technical knowledge about specific areas being modified, creating barriers to contribution.
Dangerous Bus Factor Issues
Several critical areas rely on single individuals:
- Legacy areas untouched for extended periods contain vital context only Hayley and Joao possess. This institutional knowledge will be lost without proper documentation and knowledge transfer.
- Our tendency to work in independent silos makes knowledge transfer difficult. When team members are absent, handovers become complicated and time-consuming.
Cross-Team Knowledge Sharing Pressure
We face the dual challenge of needing to:
- Become GitLab Runner SMEs to share on-call responsibilities with that team
- Mentor the GitLab Runner team to help them become registry SMEs
This creates scheduling conflicts and capacity issues around who can mentor, who can be mentored, and how many mentoring relationships we can sustain simultaneously.
Architectural Drift
Conway's Law suggests our "everyone owns everything" team structure inevitably weakens component boundaries and APIs. Given the registry's size, we risk creating an unmaintainable monolith without clearer ownership boundaries.
Failed Knowledge Sharing Attempts
Simply declaring "we need to break silos" and "we should share knowledge" has proven ineffective. We need structural changes to ensure knowledge transfer actually occurs.
Superficial Code Reviews
Our reviews often lack the necessary depth. This is not for lack of effort: it is extraordinarily difficult to review changes effectively in components you haven't worked with in 18+ months while you are focused on other high-priority areas.
Proposed Solution: Structured Ownership Model
Formalized Ownership Structure
Implement a formal ownership model with:
- Primary Owners: Responsible for heavy lifting and acting as DRIs (Directly Responsible Individuals)
- Secondary Owners: Supported by primaries through issue assignment, mentoring, and guidance on larger changes
Area Identification and Assignment
- Identify distinct areas within the container registry
- Assign primary and secondary owners based on expertise and interest
- Each team member defines their capacity for ownership roles
- Acknowledge that some areas may remain unowned initially, but at least we'll have visibility into these gaps
Example Assignment Structure:
Pawel:
- Primary: Storage drivers, notifications, release automation, CI
- Secondary: HTTP router, DB work, rate limiting
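To make the "visibility into these gaps" point concrete, here is a minimal sketch of how assignments could be recorded as data and checked for coverage. It uses the example assignment above; the area list, the `Ownership` class, and the `coverage_gaps` helper are all hypothetical names introduced for illustration, not an existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class Ownership:
    """Primary and secondary areas for one team member (illustrative model)."""
    primary: set[str] = field(default_factory=set)
    secondary: set[str] = field(default_factory=set)


def coverage_gaps(areas, assignments):
    """Return (unowned, no_backup): areas with no primary, and areas with
    a primary but no secondary. Both lists are sorted for stable output."""
    primaries = set().union(*(o.primary for o in assignments.values()))
    secondaries = set().union(*(o.secondary for o in assignments.values()))
    unowned = sorted(a for a in areas if a not in primaries)
    no_backup = sorted(a for a in areas if a in primaries and a not in secondaries)
    return unowned, no_backup


# Assumed area list, taken only from the example assignment above.
AREAS = [
    "Storage drivers", "Notifications", "Release automation", "CI",
    "HTTP router", "DB work", "Rate limiting",
]

ASSIGNMENTS = {
    "Pawel": Ownership(
        primary={"Storage drivers", "Notifications", "Release automation", "CI"},
        secondary={"HTTP router", "DB work", "Rate limiting"},
    ),
}

unowned, no_backup = coverage_gaps(AREAS, ASSIGNMENTS)
print("No primary owner:", unowned)
print("Primary but no secondary (bus factor of one):", no_backup)
```

With only one person assigned, every area shows up in one of the two lists, which is the kind of gap report the assignment exercise is meant to produce.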
Cross-Team Ownership Integration
- Mentoring someone from the GitLab Runner team counts as having them assigned as a Secondary owner
- Being trained by the GitLab Runner team counts as being assigned a Secondary area
Active Participation Requirement
The key to success is ensuring Secondary owners actively participate. This means:
- Secondaries must perform real work in their assigned areas
- When trained by GitLab Runner, team members need hands-on experience to build practical knowledge
Documentation Responsibilities
Primary and Secondary Owners are responsible for maintaining up-to-date documentation for their components. While the ownership model establishes clear accountability for knowledge areas, it is not intended to replace proper documentation. Instead, it complements it by ensuring that:
- Component documentation is regularly maintained and updated by those most familiar with the code
- The "why" behind technical decisions is properly documented, not just the "how"
- General component purpose, functionality, and architecture are clearly explained
Documentation format remains flexible based on the component's needs and the owners' judgment. At minimum, components should have documentation that covers:
- Core purpose and functionality
- Key architectural decisions and their rationales
- Known workarounds or special considerations
- Integration points with other components
This approach balances the benefits of having "living repositories" of knowledge (the Primary/Secondary owners) with the necessity of persisted documentation that remains accessible when specific team members are unavailable. The precise documentation threshold will vary by component and by owner.
Knowledge Transfer Mechanism
Knowledge that is not suitable for documenting (the "living repository") must be actively transferred from Primaries to Secondaries. Secondaries serve as true backups who can take over when Primaries are unavailable. This creates built-in redundancy that mitigates bus factor risks.
Primary owners will be evaluated partly on how effectively they train and mentor their Secondaries. This creates a natural feedback loop and incentive to ensure knowledge sharing actually occurs.
Balancing Feature Development and Codebase Health
The ownership model complements our existing DRI system through balanced responsibilities:
- DRIs advocate for features and product requirements
- Primary/Secondary Owners advocate for codebase health and component integrity
This creates natural checks and balances:
- DRIs drive feature completion and product evolution
- Component owners ensure maintainable code and sustainable architecture
Team members will rotate through both roles, developing empathy for different perspectives. In practice:
- DRIs consult with component owners during planning
- Component owners provide architectural guidance
- Each role maintains ultimate responsibility for their domain
- Conflicts are resolved through collaborative discussion
This approach prevents technical debt accumulation while ensuring continuous product development.
Technical Implementation
- Leverage OWNERS files in our Git repository to divide the container registry into distinct areas
- Configure automatic assignment of the Primary owner for review when changes touch their areas
- Once Primary owners approve changes, they can assign secondary reviewers at their discretion
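As a rough sketch of the first two points, the ownership map could be rendered into CODEOWNERS-style lines, one path pattern followed by the Primary owner's handle, with `#` comment lines naming each area. The path patterns, the `@pawel` handle, and the `render_codeowners` helper below are assumptions made for illustration; the real area boundaries still need to be identified.

```python
# Hypothetical mapping: area name -> (path pattern, Primary owner handle).
# Only the Primary is listed, since the proposal auto-assigns Primaries and
# leaves secondary reviewers to the Primary's discretion.
AREA_PATHS = {
    "Storage drivers": ("/registry/storage/driver/", "@pawel"),
    "Notifications": ("/notifications/", "@pawel"),
}


def render_codeowners(area_paths):
    """Render one comment line and one 'pattern owner' line per area."""
    lines = []
    for area in sorted(area_paths):
        pattern, primary = area_paths[area]
        lines.append(f"# {area}")
        lines.append(f"{pattern} {primary}")
    return "\n".join(lines) + "\n"


print(render_codeowners(AREA_PATHS), end="")
```

Generating the file from a single mapping keeps the list of areas and the reviewer configuration from drifting apart, though maintaining the file by hand would work just as well for a small number of areas.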
Built-in Feedback Loop
- Reviewer workload becomes a secondary metric
- If someone has too many reviews as a Primary, they either:
  - Own too many areas and should consider reducing scope, or
  - Are doing too much outside work and should reduce involvement
- This creates a natural incentive to distribute ownership and train others
- The feedback loop reinforces knowledge sharing: more ownership = more reviews = greater incentive to train others
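The workload check described above could be approximated with a simple count of review requests per Primary over some period. This is only a sketch: the `max_reviews` threshold, the input shapes, and the `review_load_report` helper are hypothetical, and the team would need to decide what counts as "too many".

```python
from collections import Counter


def review_load_report(reviews, primary_areas, max_reviews):
    """Flag Primaries assigned more than max_reviews reviews.

    reviews: list of owner handles, one entry per review request.
    primary_areas: dict of owner handle -> collection of areas they own.
    Returns {owner: {"reviews": n, "areas": k}} for overloaded owners only.
    """
    counts = Counter(reviews)
    report = {}
    for owner, n in counts.items():
        if n > max_reviews:
            areas = len(primary_areas.get(owner, ()))
            report[owner] = {"reviews": n, "areas": areas}
    return report


# Illustrative data only: handles and counts are made up.
reviews = ["@pawel"] * 5 + ["@hayley"] * 2
primary_areas = {"@pawel": ["Storage drivers", "Notifications", "CI"]}

print(review_load_report(reviews, primary_areas, max_reviews=3))
```

Reporting the number of owned areas next to the review count helps distinguish the two cases in the list above: many areas points to reducing scope, while few areas with many reviews points to too much involvement elsewhere.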