Commit b7f2a8a3 authored by Paul Phillips

Update Development Analytics team roadmap

parent 768e1a6a
+9 −22
@@ -27,36 +27,23 @@ Support the establishment and enforcement of the Infrastructure Platforms depart
- Provide the DevEx section and Platforms with information about test suite effectiveness, bugs identified by engineers and customers, incidents, and other production data to guide engineering teams
- Seek to build solutions into the GitLab product itself, so that our customers can also benefit from what we build.

## FY26-FY27 Roadmap
## FY27 Roadmap

### Now FY26-Q4
### Now FY27-Q2

**Focus: Improve visibility and processes to allow Engineering teams to self-serve access to test health data. Consolidate DevEx data and dashboards** (FY26-Q3 to FY26-Q4)
See the Q2 Planning issue: https://gitlab.com/gitlab-org/quality/analytics/team/-/work_items/573 for the most up-to-date view of what the team is currently working on.

| Epic | Description |
| ------ | ------ |
| [Build CI Failure Signatures for Pattern Detection and Correlation](https://gitlab.com/groups/gitlab-org/quality/analytics/-/epics/27) | Complete the work from Q3: add failure categories and signatures to the ClickHouse datastore. This will enable real-time dashboards and alerts on CI failures, and put in place the data we need to identify true master-broken incidents quickly. This is a foundational element that feeds into pipeline stability improvements. |
| [Build single backend test observability solution across all test levels](https://gitlab.com/groups/gitlab-org/quality/analytics/-/epics/28) | Complete our ClickHouse-based Test Observability dashboards from Q3. These dashboards will underpin our work on identifying and fixing, deleting, or quarantining flaky tests, and will support deep links into the specific flaky test issues we create, giving engineers much improved visibility into the health of their tests. |
| [Improve the quarantine process for flaky tests](https://gitlab.com/groups/gitlab-org/quality/-/epics/259) | Improve flaky test detection by moving to ClickHouse-based data, and support the auto-quarantine system with Test Governance, to drive CI stability. Our success metric here is to drive down the number of flaky tests and reduce unneeded pipeline failures. |
| [Review CI failures and ensure top infrastructure related reasons that fail pipelines are being addressed](https://gitlab.com/groups/gitlab-org/quality/-/epics/263) | Aligned with our DX survey actions around CI stability, we will review the top reasons for CI failures (such as infrastructure issues or timeouts) and create issues with the responsible teams to work through and resolve them. Our success metric here is to reduce the number of unneeded pipeline failures. |
| [Introduce test coverage observability with ClickHouse and Grafana](https://gitlab.com/groups/gitlab-org/quality/-/epics/240) | Engineering teams lack visibility into test coverage trends and patterns across our codebase. While coverage data is generated during CI/CD, it's trapped in short-lived artifacts. This is a foundational component for surfacing coverage to teams, allowing them to understand how changes such as quarantined or deleted tests impact their coverage. |
|  TBD | Support SaaS availability call with dashboards   |
| [Migrate CI related Development Analytics snowflake dashboards and data to ClickHouse/Grafana](https://gitlab.com/groups/gitlab-org/quality/analytics/-/epics/31) | Migrate CI-related Development Analytics Snowflake dashboards and data to ClickHouse/Grafana to improve discoverability |
| [Migrate existing Devex Dashboards to new Data Path](https://gitlab.com/groups/gitlab-org/quality/analytics/-/epics/29) | Support consolidation of DevEx dashboards and data to Grafana/ClickHouse |
### Next FY27-Q3/Q4

See also Q4 Planning issue: https://gitlab.com/gitlab-org/quality/analytics/team/-/issues/309
**Focus: Scale out usage of data/dashboards. Build product features to improve pipeline telemetry and enable Engineering teams to improve CI performance** (FY27-Q1 to FY27-Q2)

### Next FY27-Q1/Q2

**Focus: Scale out usage of data/dashboards, with improved docs and a centralised landing page for teams. Build product features to improve pipeline telemetry and enable Engineering teams to improve CI performance** (FY27-Q1 to FY27-Q2)

- Improve the broken master branch detection process to reduce time to recovery.
- Docs/training/Office Hours sessions to enable teams to use the dashboards/alerts
- Build scalable CI job telemetry reporting (into product, via runners)
- Factory-heavy RSpec test observability, guardrails, and remediation
- MR Process Visibility: reviewer roulette, approval resets, and review timeline
- Help build scalable CI job telemetry reporting (into product, via runners)
- Dogfood Data Insight Platform Dashboarding capabilities (if ready)
- Triage Ops maintenance and improvements (e.g. Migrate Triage Ops to Runway)

### Later FY27-Q3 and beyond
### Later FY28 and beyond

**Focus: Move from custom tooling to product features**