
Draft: POC -- Predict pipeline failures with AI

Payton Burdette requested to merge predicting-pipeline-failures-poc into master

What does this MR do and why?

As part of the Verify AI hackathon, Payton and Vlad decided to work on predicting the likelihood of pipeline failures with AI 🎱

The purpose of this MR is to create a POC (proof of concept) to showcase how we expect this feature to work.

What to expect from this MR

This MR adds a working proof of concept: a button in the pipeline view that triggers an AI command tool, which performs a basic analysis of whether the pipeline is likely to pass or fail and prints the output to the Duo Chat window.

What not to expect from this MR

In this MR we will not add a new feature flag or test coverage, given the short timeframe. Instead we will focus on the feature work; if this happens to go somewhere, we can work on making the code more acceptable. For now, our goal is simply to get something working within a short period of time.

Demo Link 📹

https://youtu.be/E28IE0yOhoE (6/20/2024)

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Button location (pipeline header)

Screenshot 2024-06-17 at 5.27.28 PM.png

Chat interaction

Screenshot_2024-06-20_at_12.16.00_PM

AI results

For this initial proof of concept, the AI command tool is asked to perform a basic analysis of a subset of historical pipeline data versus the current pipeline's metrics, including the following:

  • The :metrics for the current pipeline being analyzed by the tool (`pipeline.merge_request.metrics`, such as how many lines were added or removed, the number of files changed, etc.) as JSON
  • The :status, :failure_reason, and :metrics for the last 200 completed pipelines within the same project, in the form of an array of pipelines converted to JSON with only the relevant attributes
  • The AI prompt requests an analysis of the current pipeline's metrics in comparison to the historical pipelines in order to generate a probability of failure as a percentage, a suggested action (cancel or let the pipeline run), and an overview/summary of the actual data analysis
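The data assembly described above can be sketched in plain Ruby. This is a minimal illustration only: the method name, hash shapes, and attribute selection are assumptions for the sketch, not the actual GitLab models or prompt code.

```ruby
require "json"

# Hedged sketch: build the JSON payload sent to the AI prompt, combining the
# current pipeline's metrics with the relevant attributes of up to 200
# historical pipelines. Names here are illustrative assumptions.
def build_prompt_payload(current_metrics, historical_pipelines)
  {
    current_pipeline: { metrics: current_metrics },
    historical_pipelines: historical_pipelines.last(200).map do |pipeline|
      # Keep only the attributes the prompt cares about.
      pipeline.slice(:status, :failure_reason, :metrics)
    end
  }.to_json
end

# Example usage with stand-in data:
payload = build_prompt_payload(
  { lines_added: 42, lines_removed: 3, files_changed: 5 },
  [{ id: 1, status: "success", failure_reason: nil, metrics: { files_changed: 4 } }]
)
```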

The following is an example of the AI output (the actual output from the demo video):

Probability of Failure: 20%

Suggested Action: Let the pipeline run

Summary of Analysis:

Based on the provided information, the current pipeline and the past pipelines, the probability of the current pipeline failing is approximately 20%.

The key factors that contribute to this assessment are:

Past Pipeline Success Rate: The past pipelines have all been successful, with no failures or known failure reasons. This suggests that the project and the pipeline configuration are generally stable and reliable.

Similarity of Metrics: The metrics for the current pipeline are very similar to the metrics of the previous successful pipeline. This indicates that the changes being implemented in the current pipeline are likely to be of a similar nature and complexity as the previous successful ones.

Lack of Failure Reasons: The past pipelines did not have any reported failure reasons, which suggests that the pipeline configuration and the codebase are well-tested and maintained.

Timing of the Pipeline: The current pipeline was triggered recently, and the duration of the pipeline execution is within the expected range, indicating that there are no obvious issues with the pipeline setup or the code changes.

Based on these factors, the probability of the current pipeline failing is relatively low, around 20%. The project and the pipeline configuration appear to be stable, and the changes being implemented are likely to be similar to the previous successful ones.

Therefore, the suggested action is to let the pipeline run, as it has a high chance of succeeding. Canceling the pipeline at this stage would not be necessary, as the risk of failure is relatively low.

How to set up and validate locally

  1. Set up Duo locally (see the docs from the AI team)
  2. Ensure you have Duo active for a local group that has existing pipelines (`GITLAB_SIMULATE_SAAS=1 RAILS_ENV=development bundle exec rake 'gitlab:duo:setup[<test-group-name>]'`)
  3. Navigate to a project in a Duo-enabled group and create a new MR or trigger a pipeline
  4. While the pipeline is running, click the Predict failure button (see the screenshot above for its location)
    • The Duo Chat panel should open on the right side of the screen
    • It may take a number of seconds for Duo to generate an answer
    • This will only work for currently running or pending pipelines
    • Running /predict_pipeline directly in Duo Chat will cause an error, as the command tool requires a pipeline ID to run (which is passed through the GraphQL chat mutation from the view)
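For illustration, here is a hedged sketch of how the view might pass the pipeline ID through the GraphQL chat mutation. The mutation and field names below are assumptions made for the sketch, not the actual GitLab GraphQL schema; only the slash command and the fact that a pipeline ID is passed come from this MR.

```ruby
require "json"

# Hedged sketch: build a GraphQL chat request that carries the pipeline's
# global ID alongside the /predict_pipeline command. The mutation shape is
# an illustrative assumption, not GitLab's actual schema.
def predict_pipeline_request(pipeline_global_id)
  mutation = <<~GRAPHQL
    mutation($resourceId: ID!) {
      chat(input: { resourceId: $resourceId, content: "/predict_pipeline" }) {
        requestId
        errors
      }
    }
  GRAPHQL

  { query: mutation, variables: { resourceId: pipeline_global_id } }.to_json
end

# Example usage with a stand-in global ID:
request = predict_pipeline_request("gid://gitlab/Ci::Pipeline/42")
```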
Edited by Vlad Wolanyk
