
Create a Dashboard for Observing E2E AI Functional Test Results and Stability Metrics

Description

From a recent discussion with @tlinz, a need has been identified for a centralized dashboard that provides visibility into the stability of the different AI features. While we have visibility into qualitative metrics, we lack visibility into functional validation. We already have E2E test coverage for AI features, listed in https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/2410, and the tests run on scheduled pipelines. This dashboard will provide daily visibility into the test results.

Objectives

  1. Centralize E2E AI Functional Test Dashboard: Collect and display results from all E2E test runs in a single, easy-to-access dashboard, highlighting pass/fail rates and trends.
  2. Monitor Stability Metrics: Include key metrics that indicate the stability of test cases over time, such as failure frequency and performance inconsistencies (see the sketch after this list).
  3. Visibility for Stakeholders: Provide a clear, concise view of test results, test descriptions, and related context for stakeholders, enabling informed decision-making and quicker identification of problem areas.
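
As a starting point for objective 2, something like the sketch below could derive per-test stability metrics from raw run data. It assumes each scheduled run is reduced to per-test records of name, status, and finish time; the `TestRun` shape and the status-flip flakiness signal are illustrative assumptions, not a settled schema.

```python
# Minimal sketch: stability metrics from per-test run records.
# The record shape and the flakiness heuristic are assumptions.
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TestRun:
    name: str            # E2E test case name
    status: str          # "passed" or "failed"
    finished_at: datetime


def stability_metrics(runs: list[TestRun]) -> dict[str, dict]:
    """Aggregate per-test pass rate, failure count, and status flips."""
    by_test: dict[str, list[TestRun]] = defaultdict(list)
    for run in runs:
        by_test[run.name].append(run)

    metrics = {}
    for name, history in by_test.items():
        history.sort(key=lambda r: r.finished_at)
        failures = sum(1 for r in history if r.status == "failed")
        # A pass/fail flip between consecutive runs is a cheap flakiness
        # signal: frequent flips without a code change suggest instability.
        flips = sum(
            1 for prev, cur in zip(history, history[1:])
            if prev.status != cur.status
        )
        metrics[name] = {
            "runs": len(history),
            "pass_rate": 1 - failures / len(history),
            "failures": failures,
            "status_flips": flips,
        }
    return metrics
```

Ranking tests by `status_flips` and `failures` would directly surface the "highest failure rate or recurring issue" areas called out in the requirements below.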

Requirements

  • Integrate data from existing E2E test coverage reports, similar to the qualitative dashboard - https://lookerstudio.google.com/u/0/reporting/151b233a-d6ad-413a-9ebf-ea6efbf5387b/page/p_tn84969jfd
  • Display real-time and historical data on test stability.
  • Highlight areas with the highest failure rates or recurring issues.
  • Include visualizations (e.g., graphs or charts) to make data interpretation easier.
  • Ensure the dashboard can track ongoing and upcoming tests, with an option to drill down into specific failures or flaky tests (a data-collection sketch follows this list).
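
One plausible way to feed the dashboard, sketched below, is to pull pass/fail counts from the scheduled pipelines via GitLab's pipelines and pipeline test report REST endpoints. `PROJECT_ID` and the `GITLAB_TOKEN` environment variable are placeholders, and pagination, retries, and loading into the dashboard's data source are left out.

```python
# Rough sketch: collect pass/fail counts from recent scheduled pipelines
# using the GitLab REST API. PROJECT_ID and GITLAB_TOKEN are placeholders.
import os

import requests

GITLAB_API = "https://gitlab.com/api/v4"
PROJECT_ID = "12345"  # placeholder: project running the scheduled E2E tests
HEADERS = {"PRIVATE-TOKEN": os.environ["GITLAB_TOKEN"]}


def fetch_scheduled_test_reports(per_page: int = 20) -> list[dict]:
    """Summarize test reports from the most recent scheduled pipelines."""
    pipelines = requests.get(
        f"{GITLAB_API}/projects/{PROJECT_ID}/pipelines",
        headers=HEADERS,
        params={"source": "schedule", "per_page": per_page},
        timeout=30,
    ).json()

    reports = []
    for pipeline in pipelines:
        report = requests.get(
            f"{GITLAB_API}/projects/{PROJECT_ID}"
            f"/pipelines/{pipeline['id']}/test_report",
            headers=HEADERS,
            timeout=30,
        ).json()
        reports.append({
            "pipeline_id": pipeline["id"],
            "finished_at": pipeline.get("updated_at"),
            "success_count": report.get("success_count"),
            "failed_count": report.get("failed_count"),
            "total_count": report.get("total_count"),
        })
    return reports
```

Persisting these rows daily (e.g. into the same backing store the qualitative Looker Studio dashboard reads from) would cover both the real-time and the historical views.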

Priority: High – Stability concerns are critical for stakeholder visibility and effective monitoring of AI features
