Improve cng signature diversity ratio
What does this MR do and why?
Improves log normalization by adding support for additional dynamic values, which helps reduce the failure signature diversity score for cng.
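For readers unfamiliar with the approach, here is a minimal sketch of normalizing dynamic values so that log lines differing only in IDs or hashes yield the same failure signature. It is written in Python for illustration only; the `RULES` list and `normalize` function are hypothetical and are not the patterns or code used in this MR.

```python
import re

# Hypothetical rules, applied in order. Hashes are replaced before plain
# numbers so that a long hex string is not treated as an ID.
RULES = [
    (re.compile(r"\b[0-9a-f]{8,}\b"), "<hash>"),  # hex strings of 8+ chars
    (re.compile(r"\b\d+\b"), "<ID>"),             # standalone integers
]


def normalize(line: str) -> str:
    """Replace dynamic fragments in a log line with stable placeholders."""
    for pattern, placeholder in RULES:
        line = pattern.sub(placeholder, line)
    return line


if __name__ == "__main__":
    a = normalize("Job 12345 failed at commit 3f2a9c1b7d")
    b = normalize("Job 67890 failed at commit 0badc0ffee")
    print(a)       # Job <ID> failed at commit <hash>
    print(a == b)  # True
```

Both lines collapse to one signature, which is the effect that lowers the diversity score.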
Changes
- Better warning handling: the system now recognizes patterns for warning fragments and normalizes them consistently
- Enhanced testing
Expected impact & dry-runs
- Before (Master branch) - Diversity ratio: 35.4%
Download the CSV file from https://app.snowflake.com/ys68254/gitlab/w33qYqpFqfDA
fca --csv <path to csv>
./bin/analyze_signatures results.csv --category cng
Click to view logs
./bin/analyze_signatures results.csv --category yarn_run
================================================================================
CI FAILURE SIGNATURE ANALYSIS
================================================================================
Data source: results.csv
Category filter: cng
Total records: 326
DETAILED ANALYSIS FOR: assets_compilation
----------------------------------------------------------------------------------------------------
Total failures: 326
Unique signatures: 64
Diversity ratio: 35.4%
- After (prsharma-cng-failures branch) - Diversity score: 19.6%
With the new normalization rules applied, the diversity score for cng is significantly reduced from 35.4% to 19.6%.
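As a sanity check on the metric, the sketch below assumes the diversity ratio is the number of unique signatures divided by the total number of failures, which matches the 64 / 326 = 19.6% shown in the output for this branch. The `diversity_ratio` function is a hypothetical illustration, not the code in `bin/analyze_signatures`.

```python
def diversity_ratio(signatures: list[str]) -> float:
    """Return unique signatures / total failures (0.0 for empty input)."""
    if not signatures:
        return 0.0
    return len(set(signatures)) / len(signatures)


if __name__ == "__main__":
    # Synthetic data: 326 failures spread over 64 distinct signatures.
    signatures = [f"sig-{i % 64}" for i in range(326)]
    print(f"{diversity_ratio(signatures):.1%}")  # 19.6%
```

A lower ratio means more failures share a signature, so they are easier to group.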
As we normalize more regex patterns, each one reduces the diversity score by less. Normalizing the following strings reduced the diversity score by only 0.3%:
- Successfully packaged chart and saved it to: /<path>/orchestrator<hash>/<path>/gitlab<unique-hash>/gitlab-<ID>.<ID>.tgz
+ Successfully packaged chart and saved it to: /<path>/orchestrator<hash>-<ID><unique-hash>/gitlab<unique-hash>/gitlab-<ID>.<ID>.tgz
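To put the 0.3% in perspective: assuming the ratio is unique signatures divided by total failures, and that this rule merged exactly two signatures into one (an assumption, since the MR does not state the count), the expected drop over 326 records is 1 / 326. A small hypothetical helper shows the arithmetic:

```python
def ratio_drop(merged_away: int, total: int) -> float:
    """Drop in diversity ratio when `merged_away` unique signatures disappear."""
    return merged_away / total


if __name__ == "__main__":
    # One fewer unique signature out of 326 total failures.
    print(f"{ratio_drop(1, 326):.1%}")  # 0.3%
```

Under this assumption, each further rule that merges a single pair of signatures moves the score by about the same small amount.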
Click to view logs
➜ triage-ops git:(prsharma-cng-failures) ✗ ./bin/analyze_signatures results.csv --category cng
====================================================================================================
CI FAILURE SIGNATURE ANALYSIS
====================================================================================================
Data source: results.csv
Category filter: cng
Total records: 326
DETAILED ANALYSIS FOR: cng
----------------------------------------------------------------------------------------------------
Total failures: 326
Unique signatures: 64
Diversity ratio: 19.6%
Related Issue
Edited by Pranshu Sharma