Establish and document formalized naming conventions for clusters and resources

Overview

During the Component Ownership Model pilot rollout, inconsistent naming conventions for clusters and resources created confusion and made it difficult for teams to understand the infrastructure landscape. This issue tracks the establishment and documentation of formalized naming conventions to ensure consistency across all infrastructure components.

Current Challenges

1. Inconsistent Naming Patterns

Clusters and resources use inconsistent naming patterns
No clear guidelines exist for naming new clusters or resources
Teams create names based on their own conventions, leading to confusion
Inconsistency makes it harder to understand relationships between components

2. Lack of Documentation

No centralized documentation of naming conventions
Teams don't know what naming patterns are expected
New teams must infer conventions from existing resources
This creates friction and requires SRE guidance

3. Naming Confusion During Pilot

During the Data Insights Platform pilot, inconsistent naming was identified as a problem
Teams had to rename resources or clarify naming with SREs
This added unnecessary work and delays

Goals

Establish clear, consistent naming conventions that:

Are easy to understand and follow
Clearly indicate the purpose and environment of a resource
Are documented and discoverable
Reduce confusion and SRE involvement in naming decisions
Scale as the infrastructure grows

Proposed Solutions

Define Naming Convention Guidelines
- Establish patterns for cluster names (e.g., {environment}-{purpose}-{region})
- Establish patterns for resource names (e.g., {project}-{component}-{environment})
- Define abbreviations and their meanings
- Document environment naming (staging, production, etc.)
- Document region naming conventions
- Provide examples of correct and incorrect names

Document Naming Conventions by Resource Type
- GKE clusters
- EKS clusters
- VMs and compute instances
- Databases and storage
- Networks and subnets
- Load balancers and ingress
- Secrets and configuration
- Other infrastructure components

Create Naming Convention Guide
- Create a comprehensive guide in the handbook or runbooks
- Include decision trees for naming different resource types
- Provide templates and examples
- Include common pitfalls to avoid
- Include how to request exceptions if needed

Establish Validation Process
- Define how naming conventions will be validated
- Include naming convention checks in the code review process
- Consider automated validation tools
- Document the review process for naming decisions

Communicate and Train
- Share naming conventions with all teams
- Include naming conventions in onboarding materials
- Provide training or examples for teams adopting the Component Ownership Model
- Update existing documentation to reference naming conventions

Audit and Standardize Existing Resources
- Audit existing clusters and resources for naming consistency
- Identify resources that don't follow conventions
- Plan migration or renaming of non-compliant resources
- Document any exceptions and their justifications
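
As a rough illustration of how the example cluster pattern {environment}-{purpose}-{region} could be checked automatically and used for an audit, here is a minimal Python sketch. It is not an agreed implementation. The environment and region lists, the rule that the purpose segment contains no hyphens, and the function names parse_cluster_name and audit are all assumptions made for this example; the real values would come from the guidelines once they are defined.

```python
import re

# Hypothetical vocabularies; the actual lists are still to be defined.
ENVIRONMENTS = ("staging", "production")
REGIONS = ("us-east1", "europe-west1")

# {environment}-{purpose}-{region}, assuming the purpose segment is
# lowercase alphanumeric with no hyphens so the name parses unambiguously.
CLUSTER_NAME = re.compile(
    r"(?P<environment>{envs})-(?P<purpose>[a-z][a-z0-9]*)-(?P<region>{regions})".format(
        envs="|".join(map(re.escape, ENVIRONMENTS)),
        regions="|".join(map(re.escape, REGIONS)),
    )
)


def parse_cluster_name(name):
    """Return the name's parts as a dict, or None if it does not match."""
    match = CLUSTER_NAME.fullmatch(name)
    return match.groupdict() if match else None


def audit(names, exceptions=frozenset()):
    """Sort names into compliant, documented exceptions, and non-compliant."""
    report = {"compliant": [], "exception": [], "non_compliant": []}
    for name in names:
        if parse_cluster_name(name) is not None:
            report["compliant"].append(name)
        elif name in exceptions:
            report["exception"].append(name)
        else:
            report["non_compliant"].append(name)
    return report


if __name__ == "__main__":
    print(parse_cluster_name("staging-analytics-us-east1"))
    print(audit(["staging-analytics-us-east1", "legacy-cluster", "gke01"],
                exceptions={"legacy-cluster"}))
```

A check like this could run in code review or CI for new resources, and the same function could be pointed at an inventory of existing names to produce the list of non-compliant resources and documented exceptions.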

Related Issues

Component Ownership Model feedback: #27175 (closed)
Infrastructure Support for Usage Billing: gl-infra#1637 (closed)

Related Documentation

Component Ownership Model handbook: https://handbook.gitlab.com/handbook/engineering/infrastructure/production/component-ownership-model/
Runbooks: https://runbooks.gitlab.com/

Success Criteria

Naming convention guidelines are defined for all major resource types
A comprehensive naming convention guide is published in the handbook or runbooks
Guide includes examples and templates for each resource type
Naming conventions are included in the Component Ownership Model handbook
Naming conventions are included in team onboarding materials
Validation process is defined and documented
Existing resources are audited for compliance
Non-compliant resources are identified and prioritized for remediation
Teams can follow conventions without SRE guidance
Naming conventions are easy to find
