Map RSpec CI jobs to a failure category
From Draft to Ready
- The MR description is up-to-date
What does this MR do?
Map a CI/CD RSpec job to a failure category, and send this as an internal event.
To go from a CI/CD failure to a failure category, we follow three steps:
- Download the job trace (the DownloadJobTrace Ruby class)
- Analyze the job trace, and assign a failure category (the JobTraceToFailureCategory Ruby class)
- Push the job <=> failure category mapping (the ReportFailureCategory Ruby class)
We also add the FailureAnalyzer class, which orchestrates the three other classes. It's the entry point to use in CI jobs.
Also, it's all in Ruby.
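As a rough illustration of how the three steps chain together, here is a minimal sketch of an orchestrator. It is not the code from this MR: FailureAnalyzerSketch, its constructor arguments, and the analyze method are hypothetical names, and the three steps are injected as plain callables so the sketch runs without the GitLab API. Only the log messages and the category name are taken from the proof of work below.

```ruby
# Hypothetical sketch: the real FailureAnalyzer class in this MR may use
# different constructors, method names, and error handling.
class FailureAnalyzerSketch
  LOG_PREFIX = "[GCLI Failure Analyzer]"

  # Each collaborator is any object responding to #call (assumption):
  #   downloader.call(job_id)            -> trace String, or nil
  #   categorizer.call(trace)            -> category String, or nil
  #   reporter.call(job_id, category)    -> sends the mapping somewhere
  def initialize(downloader:, categorizer:, reporter:, out: $stdout)
    @downloader = downloader
    @categorizer = categorizer
    @reporter = reporter
    @out = out
  end

  # Returns the failure category, or nil when the analysis was skipped.
  def analyze(job_id)
    trace = @downloader.call(job_id)
    if trace.nil? || trace.empty?
      @out.puts "#{LOG_PREFIX} Missing job trace. Exiting."
      return
    end

    category = @categorizer.call(trace)
    if category.nil?
      @out.puts "#{LOG_PREFIX} Did not find a failure category for job ##{job_id}."
      return
    end

    @reporter.call(job_id, category)
    @out.puts "#{LOG_PREFIX} Job ##{job_id} categorized as: #{category}"
    category
  end
end

# Usage with stand-in lambdas instead of real API calls.
analyzer = FailureAnalyzerSketch.new(
  downloader:  ->(_job_id) { "RuntimeError: boom" },
  categorizer: ->(_trace) { "rspec_valid_rspec_errors_or_flaky_tests" },
  reporter:    ->(job_id, category) { puts "would report #{job_id} => #{category}" }
)
analyzer.analyze(9672825806)
```

Injecting the steps keeps each one callable on its own, which matches the idea that every class can also be run as a standalone script.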
References
Contributes to Map CI/CD failures to failure categories (#512594 - closed)
Proof of work
First test - A generic ruby error
  it 'sends correct event parameters and success message' do
    send_event
    expect(http_request).to have_received(:body=).with(expected_request_body.to_json)
-   expect($stdout).to have_received(:puts).with("Successfully sent data for event: #{event_name}")
+   expect($stdout).to have_received(:puts).with("Successfully sent data for event2: #{event_name}")
  end
Report failure category
Successfully sent data for event: glci_job_failure_category
[GCLI Failure Analyzer] Job #9672825806 categorized as: rspec_valid_rspec_errors_or_flaky_tests
rspec_valid_rspec_errors_or_flaky_tests is what I expected
Second test - An infrastructure error
    send_event
    expect(http_request).to have_received(:body=).with(expected_request_body.to_json)
    expect($stdout).to have_received(:puts).with("Successfully sent data for event: #{event_name}")
+
+   raise "GitLab is currently unable to handle this request due to load. This is a test message"
  end
end
Report failure category
Successfully sent data for event: glci_job_failure_category
[GCLI Failure Analyzer] Job #9680393236 categorized as: gitlab_too_much_load
gitlab_too_much_load is what I expected
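The two tests above suggest that categorization works by matching known messages in the job trace. The following is a minimal sketch of that idea, not the JobTraceToFailureCategory implementation: the FAILURE_PATTERNS table, the failure_category_for helper, and both regular expressions are assumptions for illustration. Only the two category names and the load error message come from the tests.

```ruby
# Hypothetical pattern table. Order matters: Ruby hashes keep insertion
# order, so the more specific infrastructure pattern is checked first.
# In the second test the load message is raised inside a spec, so the
# trace would presumably also contain a generic RSpec failure marker.
FAILURE_PATTERNS = {
  "gitlab_too_much_load" =>
    /GitLab is currently unable to handle this request due to load/,
  "rspec_valid_rspec_errors_or_flaky_tests" =>
    /^Failures:|Failure\/Error:/
}.freeze

# Returns the first matching category name, or nil when nothing matches.
def failure_category_for(trace)
  FAILURE_PATTERNS.each do |category, pattern|
    return category if trace.match?(pattern)
  end
  nil
end

trace = <<~TRACE
  Failures:

    1) sends correct event parameters and success message
       Failure/Error: raise "GitLab is currently unable to handle this request due to load. This is a test message"
TRACE

puts failure_category_for(trace)
```

Returning nil for an unmatched trace lets the caller decide whether to skip reporting.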
Third test - No errors
I expect skipping the failure analysis/reporting.
[GCLI Failure Analyzer] Missing job trace. Exiting.
[DownloadJobTrace] Job did not fail: exiting early (status: success)
[GCLI Failure Analyzer] Did not find a failure category for job #9680348113.
It skipped the analysis, as expected.
How to set up and validate locally
Each class can be called like a script locally (1, 2, 3, 4).
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.