Instrumentation Audit across Core DevOps

Overview

This initiative aims to audit and assess our current instrumentation across the DevOps product portfolio to understand whether we have the right usage data in place. Quality engineering decisions should be data-driven: we need to know whether areas with bugs/tech debt actually have user adoption, or whether we should consider deprecation/removal instead of investment.

DRI: @jdrpereira

DRI Responsibilities

  • Coordinate with DevOps teams to understand current instrumentation
  • Work cross-functionally to assess data availability
  • Document findings in standardized format for each team
  • Present weekly updates at Product Quality Standup
  • Deliver final recommendations with usage-based prioritization

Business Context

As we scale to serve enterprise customers who rely on GitLab as their "Tier 0 platform," we must make strategic decisions about where to invest our engineering capacity. Understanding feature usage helps us:

  • Prioritize bug fixes in high-usage areas vs. burning down bugs against unused features
  • Make informed decisions about tech debt and maintenance vs. complete feature removal
  • Optimize engineering investment based on actual customer value
  • Support our path to $2B revenue by focusing on what customers actually use 🚀

Success Criteria

  • Complete instrumentation coverage audit across all DevOps stages
    • This could be based on Product Category, Feature, or another suitable grouping
  • Identify gaps in usage data collection
    • What usage metrics are currently tracked?
    • Is it the right data for the purpose?
    • Can we access this data easily? Where can it be viewed?
    • Is the existing data reliable and actionable?
  • Establish baseline for ongoing instrumentation health
  • Enable data-driven decisions about feature deprecation vs. investment

For each area with bugs/tech debt, we should be able to provide (see the sketch after this list):

  • Usage assessment: High/Medium/Low/Unknown usage
  • Investment recommendation: Fix/Maintain/Deprecate/Remove
  • Instrumentation improvement plan: What to add/fix to improve decision-making
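
For illustration, this per-feature deliverable could be captured in a small record like the sketch below. This is a minimal sketch only; the type and field names are hypothetical, not an agreed schema.

```python
from dataclasses import dataclass
from enum import Enum


class UsageLevel(Enum):
    """Usage assessment buckets used by the audit."""
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    UNKNOWN = "Unknown"


class Recommendation(Enum):
    """Investment recommendation per feature."""
    FIX = "Fix"
    MAINTAIN = "Maintain"
    DEPRECATE = "Deprecate"
    REMOVE = "Remove"


@dataclass
class AreaAssessment:
    """One audited area with bugs/tech debt (hypothetical field names)."""
    feature: str
    usage: UsageLevel
    recommendation: Recommendation
    instrumentation_plan: str  # what to add/fix to improve decision-making


# Example entry for a hypothetical feature:
example = AreaAssessment(
    feature="example-feature",
    usage=UsageLevel.UNKNOWN,
    recommendation=Recommendation.MAINTAIN,
    instrumentation_plan="Add a usage counter before deciding on investment.",
)
```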

Work Plan

The following key decisions shaped the rationale and structure of this plan:

  • Feature-level granularity: Audit individual features, which can then be rolled up to feature categories and stages as needed.
  • Pilot-first approach: Run Package audit first (the initiative DRI's own stage) to validate the framework, confirm timing estimates are realistic, and create a concrete example before scaling to other stages.
  • Distributed execution: Each stage provides their own DRI to conduct audits in parallel, removing bottlenecks and leveraging domain expertise. The initiative DRI provides framework and oversight to ensure consistency and reduce unconscious bias from self-evaluation.

Week 1: Setup & Kickoff

  • Create the plan and update the issue used to track all related work
  • Meet with Analytics Instrumentation team to understand and assess the available instrumentation tooling and current practices
  • Get access to all relevant data platforms and dashboards
  • Contact EMs/PMs for each stage to introduce initiative and request a DRI to help conduct the audit:
    • Create
    • Deploy
    • Package
    • Plan
    • Runner
    • Verify

Week 2: Audit Framework

  • Create a standardized audit template including (sketched after this list):
    • Feature-level inventory checklist
    • Feature → Category → Stage rollup
    • Current instrumentation status fields per feature
    • Usage classification criteria (High/Medium/Low/Unknown thresholds)
    • Data quality score (reliability, completeness, accessibility)
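
As a concrete illustration of the template fields above, a single audit row and the usage classification could look like the sketch below. Field names and the event-count thresholds are placeholder assumptions; the real cut-offs should be settled with the Analytics Instrumentation team during the pilot.

```python
from dataclasses import dataclass


@dataclass
class AuditRow:
    """One feature-level entry in the audit template (hypothetical fields)."""
    feature: str
    category: str               # supports Feature -> Category -> Stage rollups
    stage: str
    instrumented: bool          # current instrumentation status
    metrics: list[str]          # names of tracked usage metrics, if any
    monthly_events: int | None  # None when no usage data is available
    reliability: int            # data quality subscores, e.g. 1-5 each
    completeness: int
    accessibility: int


def classify_usage(row: AuditRow) -> str:
    """Map raw event volume onto High/Medium/Low/Unknown.

    The thresholds below are placeholders, not agreed values.
    """
    if not row.instrumented or row.monthly_events is None:
        return "Unknown"
    if row.monthly_events >= 10_000:
        return "High"
    if row.monthly_events >= 100:
        return "Medium"
    return "Low"
```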

Week 3: Pilot Audit & Refinement

  • Conduct pilot audit on Package stage
  • Refine audit framework based on pilot feedback/results
  • Create a scoring system for investment recommendations per feature (see the sketch after this list)
  • Document a step-by-step guide, using the Package audit as an example
  • Confirm, based on the Package pilot, that 2 weeks is a realistic estimate for executing the audit on the remaining stages
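
The scoring system could combine the usage classification with bug and maintenance-cost data, roughly as sketched below. The decision rules are assumptions for the pilot to validate or replace.

```python
def recommend(usage: str, open_bugs: int, maintenance_cost: str) -> str:
    """Map audit data to Fix/Maintain/Deprecate/Remove (placeholder rules)."""
    if usage == "High":
        # High-usage features justify active investment.
        return "Fix" if open_bugs > 0 else "Maintain"
    if usage in ("Low", "Unknown"):
        # Low or unproven value: removal if it is also expensive to keep.
        return "Remove" if maintenance_cost == "High" else "Deprecate"
    # Medium usage: keep healthy without major new investment.
    return "Maintain"
```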

Week 4-6: Audit Execution

  • Kick off the audit on the remaining stages with their DRIs
  • Complete feature-level audits on the remaining stages:
    • Create
    • Deploy
    • Plan
    • Runner
    • Verify

Week 7-9: Analysis & Deliverables

Analysis

  • Compile all feature-level audit findings into a single dataset (see the script after this list)
  • Create rollup views
  • Cross-reference usage data with bug counts and tech debt
  • Identify patterns across stages (systemic instrumentation gaps)
  • For each feature, determine:
    • Usage assessment: High/Medium/Low/Unknown usage
    • Investment recommendation: Fix/Maintain/Deprecate/Remove
    • Instrumentation improvement plan
  • Create feature-level usage vs. maintenance cost matrix
  • Flag "zombie features" (high maintenance, low usage)
  • Highlight "under-invested gems" (high usage, high bugs)
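
If each stage delivers its audit as a CSV export of the template, the compilation, rollups, matrix, and flags could be produced with a short script like the one below. This is a sketch; the file locations and column names are assumptions mirroring the template fields.

```python
import glob

import pandas as pd

# Compile all per-stage audit files into a single dataset (assumed layout).
frames = [pd.read_csv(path) for path in glob.glob("audits/*_audit.csv")]
df = pd.concat(frames, ignore_index=True)

# Rollup views: feature -> category -> stage.
category_rollup = df.groupby(["stage", "category"]).agg(
    features=("feature", "count"),
    open_bugs=("open_bugs", "sum"),
)

# Feature-level usage vs. maintenance cost matrix (feature counts per cell).
matrix = pd.crosstab(df["usage"], df["maintenance_cost"])

# "Zombie features": high maintenance, low (or unknown) usage.
zombies = df[(df["maintenance_cost"] == "High") & (df["usage"].isin(["Low", "Unknown"]))]

# "Under-invested gems": high usage, high bug counts (threshold is a placeholder).
gems = df[(df["usage"] == "High") & (df["open_bugs"] >= 10)]

print(category_rollup)
print(matrix)
print("Zombie feature candidates:", sorted(zombies["feature"]))
print("Under-invested gems:", sorted(gems["feature"]))
```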

Stakeholder Review

  • Review findings with each stage EM/PM
  • Refine recommendations based on feedback

Report

Available at https://instrumentation-audit-054c5e.gitlab.io/.

  • Document feature-by-feature breakdown with category/stage rollups
  • Create executive summary with:
    • Top feature deprecation candidates across stages
    • Top feature investment priorities
    • Category and stage level insights
  • Share findings with leadership

Handoff

  • Create specific action items for each team
  • Establish ongoing instrumentation health monitoring process
  • Hand off ownership to EMs/PMs for follow-through

Stage DRIs

  • Create: @psjakubowska, @jwoodwardgl, @adebayo_a
  • Plan: @pskorupa, @fernanda.toledo
  • Verify:Pipelines: @furkanayhan (group::pipeline authoring), @allison.browne (group::pipeline execution)
  • Deploy: @timofurrer, @tigerwnz
  • Runner: @ratchade, @avonbertoldi (Category:Runner Core); @pedropombeiro, @narendran-kannan (Category:Fleet Visibility)
  • Package: @jdrpereira