Test History MVC - Test Summary Widget
<!-- The first four sections: "Problem to solve", "Intended users", "User experience goal", and "Proposal", are strongly recommended, while the rest of the sections can be filled out during the problem validation or breakdown phase. However, keep in mind that providing complete and relevant information early helps our product team validate the problem and start working on a solution. -->

### Problem to solve

Developers do not have an easy way to see whether a test has recently been passing, failing, or skipped as they research a test failure in the context of a test run in an MR. Without this context it is hard to point at why the build is feeling/getting slower and which test cases may be to blame for that. Without that context it's not hard to imagine Delaney and Sasha starting to skip tests and not write tests for new code.

<!-- What problem do we solve? Try to define the who/what/why of the opportunity as a user story. For example, "As a (who), I want (what), so I can (why/value)." -->

### Intended users

* [Sasha (Software Developer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sasha-software-developer)
* [Delaney (Development Team Lead)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#delaney-development-team-lead)

### User experience goal

<!-- What is the single user experience workflow this problem addresses?
For example, "The user should be able to use the UI/API/.gitlab-ci.yml with GitLab to <perform a specific task>"
https://about.gitlab.com/handbook/engineering/ux/ux-research-training/user-story-mapping/ -->

The user should be able to use the UI with GitLab to see, in the test summary, whether a test has passed or failed more often over the last N executions on the default branch.

### Proposal

This is an MVC proposal to measure if there is interest / value in this feature. If there is, we will likely re-architect this to be more scalable and functional going forward.

* Count failures by test for any test execution in the project on the default branch.
* We changed course to use the `default` branch for comparison and realize it may limit what we can learn in the MVC, but the implementation was made easier so we can learn faster.
* If a job/pipeline generates a lot of errors (how many is TBD by the backend developer), stop counting and do not increment the count for existing tests. This is likely the result of something else that went wrong, and counting it would lead to noisy data for users. For example, if the database was down and all the tests are failing, we shouldn't count that as a test failure.
* This data will be sent along with the test case entity already used by the Test Summary Widget.
* Display how many times a test has failed over the last X days (history of the test) for each failed test that has a previous failure recorded in the default branch.
* Track expansions of the Test Summary widget as a Snowplow event we can display in the [group dashboard](https://app.periscopedata.com/app/gitlab/633395/Testing-Category-Metrics).

##### Application Logic

###### Backend

After parsing a JUnit XML report file, if:

1. The feature flag is enabled for this project
1. The file contains fewer than 200 total failures

We will increment a persisted counter for each test that failed.

---

We will create an endpoint/method that will retrieve a count of how many times a test has failed in the last 14 days.

###### Frontend

Upon loading the MR page, if:

1. The feature flag is enabled for this project
1. The JUnit XML report file is done processing
1. The JUnit XML report file contains failures
1. The JUnit XML report failures are loaded into the MR widget

We will build the recent failures data into the data for the MR Widget, then draw the tooltips on the frontend with that information.

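The Backend rules above (feature flag check, the 200-failure cutoff, and the 14-day lookup) can be sketched as follows. This is only an illustration under stated assumptions, not the planned implementation: `FailureStore`, `record_report`, and `recent_failure_count` are hypothetical names, the store is an in-memory dictionary standing in for the persisted counter, and the default-branch filter is left out for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Values taken from the Backend section above.
MAX_FAILURES_PER_REPORT = 200
RECENT_WINDOW = timedelta(days=14)


@dataclass
class FailureStore:
    """Hypothetical in-memory stand-in for the persisted failure counter."""

    # Maps (project_id, test_case_name) to a list of failure timestamps.
    failures: dict = field(default_factory=dict)

    def record_report(self, project_id, failed_tests, failed_at, flag_enabled=True):
        """Record the failures of one parsed report.

        Returns True if the failures were recorded, False if the report
        was skipped (flag disabled or too many failures).
        """
        if not flag_enabled:
            return False
        # A report with 200 or more failures is treated as noise
        # (for example, the database was down) and is not counted.
        if len(failed_tests) >= MAX_FAILURES_PER_REPORT:
            return False
        for name in failed_tests:
            self.failures.setdefault((project_id, name), []).append(failed_at)
        return True

    def recent_failure_count(self, project_id, test_case_name, now):
        """Count failures of one test within the last 14 days."""
        cutoff = now - RECENT_WINDOW
        timestamps = self.failures.get((project_id, test_case_name), [])
        return sum(1 for ts in timestamps if ts >= cutoff)


store = FailureStore()
now = datetime(2020, 9, 30)
store.record_report(1, ["spec_a", "spec_b"], now - timedelta(days=20))  # outside the window
store.record_report(1, ["spec_a"], now - timedelta(days=3))             # inside the window
store.record_report(1, [f"spec_{i}" for i in range(200)], now)          # skipped as noise
print(store.recent_failure_count(1, "spec_a", now))  # prints 1
```

Keeping a timestamp per failure, rather than a single running total, is what lets the lookup restrict itself to the last 14 days.
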
#### Postgres

We will store the failures and test cases in the database in tables like:

```
ci_test_cases(project_id, test_case_name)
  has_many ci_test_case_failures(ci_test_case_id, failed_at, ref)
```

Then query those tables to create the necessary endpoints as described here: https://gitlab.com/gitlab-org/gitlab/-/issues/241759#note_417521675

* For the `test_case_name` we'll need a reasonable limit (text limit).
* I suppose `failed_at` is needed for doing statistical calculations, since the same data is probably available in one of the CI tables.
* For inserting records in batches, check out: https://docs.gitlab.com/ee/development/insert_into_tables_in_batches.html
* It might make sense at some point to aggregate and store the failure counts (last 14 days) on the `ci_test_cases` table.

Other possible technical implementations are discussed here: https://gitlab.com/gitlab-org/gitlab/-/issues/235525#note_394298215

<!-- How are we going to solve the problem? Try to include the user journey! https://about.gitlab.com/handbook/journeys/#user-journey -->

### Further details

<!-- Include use cases, benefits, goals, or any other details that will help us understand the problem better. -->

Before investing time in architecting a real solution to start on the epic for [Testing History for MRs and Pipelines](https://gitlab.com/groups/gitlab-org/-/epics/4155), or even the MVC for [Test History for Projects](https://gitlab.com/gitlab-org/gitlab/-/issues/210250), we will use this MVC to validate that there is value and move the feature forward. We fully expect not to re-use the code, and maybe not the interface, built as part of this issue as we learn what is valuable.

The ~backend should be complete in this issue, which should then be re-usable in https://gitlab.com/gitlab-org/gitlab/-/issues/235525 which only has a ~frontend label.

### Permissions and Security

<!-- What permissions are required to perform the described actions?
Are they consistent with the existing permissions as documented for users, groups, and projects as appropriate? Is the proposed behavior consistent between the UI, API, and other access methods (e.g. email replies)?

Consider adding checkboxes and expectations of users with certain levels of membership https://docs.gitlab.com/ee/user/permissions.html

* [ ] Add expected impact to members with no access (0)
* [ ] Add expected impact to Guest (10) members
* [ ] Add expected impact to Reporter (20) members
* [ ] Add expected impact to Developer (30) members
* [ ] Add expected impact to Maintainer (40) members
* [ ] Add expected impact to Owner (50) members -->

### Documentation

* Add a note to the [existing documentation](https://docs.gitlab.com/ee/ci/unit_test_reports.html#how-it-works) that the number of times a test has failed previously is now included.
* Update the screenshot for the newly refined UI.

### Availability & Testing

<!-- This section needs to be retained and filled in during the workflow planning breakdown phase of this feature proposal, if not earlier.

What risks does this change pose to our availability? How might it affect the quality of the product? What additional test coverage or changes to tests will be needed? Will it require cross-browser testing?

Please list the test areas (unit, integration and end-to-end) that need to be added or updated to ensure that this feature will work as intended. Please use the list below as guidance.

* Unit test changes
* Integration test changes
* End-to-end test change

See the test engineering planning process and reach out to your counterpart Software Engineer in Test for assistance: https://about.gitlab.com/handbook/engineering/quality/test-engineering/#test-planning -->

### What does success look like, and how can we measure that?

#### Acceptance Criteria

* Test execution data is tracked for all failed tests and displayed on the test summary MR Widget
* MR can still load as quickly as it does today
* Opens of the Test Summary Widget are tracked - A [separate issue has been created for this]

```
GIVEN a junit artifact is uploaded during an MR pipeline run
GIVEN there is at least one failed test in the junit report artifact
WHEN you visit the MR page
WHEN you expand the test summary in the MR widget section
THEN there exists a tooltip beside the failing test indicating how many times it has failed recently
```

#### Measures of Success

* This feature will be successful if there are at least 500 opens of the test widget per week in the first full week 30 days after launch.
* This feature will be successful if we collect any feedback from internal and external users within 45 days of launch to inform the next iteration.

<!-- Define both the success metrics and acceptance criteria. Note that success metrics indicate the desired business outcomes, while acceptance criteria indicate when the solution is working correctly. If there is no way to measure success, link to an issue that will implement a way to measure this. -->

### What is the type of buyer?

The buyer for this is the developer who wants to see how a test has behaved previously to inform their current MR.

<!-- What is the buyer persona for this feature? See https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/buyer-persona/
In which enterprise tier should this feature go? See https://about.gitlab.com/handbook/product/pricing/#four-tiers -->

### Is this a cross-stage feature?

No.

<!-- Communicate if this change will affect multiple Stage Groups or product areas. We recommend always starting with the assumption that a feature request will have an impact into another Group.
Loop in the most relevant PM and Product Designer from that Group to provide strategic support to help align the Group's broader plan and vision, as well as to avoid UX and technical debt. https://about.gitlab.com/handbook/product/#cross-stage-features -->

### Links / references

Feedback for this MVC will be collected in the [feedback issue](https://gitlab.com/gitlab-org/gitlab/-/issues/242233).

### Follow-up

Issues to create:

- [ ] Create a daily clean-up job to ensure the failure table holds 14 days of data only
- [ ] Add a per-pipeline limit of 1000 failures stored
- [ ] Add a per-project limit of 100k failures stored at once
- [ ] Aggregate failures in the last 14 days in a background job and persist them into the `ci_test_cases` table to avoid expensive GROUP BY COUNT queries

<!-- Label reminders - you should have one of each of the following labels if you can figure out the correct ones -->
issue