Improving issue triage using Machine Learning
We are starting further discussions about applying labels automatically as part of the issue triage process.
Triage requirements
- Group label
- Stage label
- Type label
- Severity label (not required for features)
- Category labels (optional)
Currently covered:
- Stage and group label inference from category labels
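The existing inference is not shown in this issue. As a rough sketch of the idea, assuming a lookup table from category labels to stage and group labels (the table contents, the label names, and the function name `infer_stage_and_group` are hypothetical placeholders):

```python
from typing import Dict, List, Set

# Hypothetical mapping: real category, stage and group names are not given here.
CATEGORY_TO_STAGE_GROUP: Dict[str, Dict[str, str]] = {
    "category-a": {"stage": "stage-x", "group": "group-1"},
    "category-b": {"stage": "stage-x", "group": "group-2"},
    "category-c": {"stage": "stage-y", "group": "group-3"},
}


def infer_stage_and_group(labels: List[str]) -> Set[str]:
    """Return the stage/group labels implied by the category labels on an issue.

    A stage or group is only inferred when every known category on the
    issue agrees on it; ambiguous cases are left for manual triage.
    """
    matches = [CATEGORY_TO_STAGE_GROUP[l] for l in labels if l in CATEGORY_TO_STAGE_GROUP]
    stages = {m["stage"] for m in matches}
    groups = {m["group"] for m in matches}

    inferred: Set[str] = set()
    if len(stages) == 1:
        inferred |= stages
    if len(groups) == 1:
        inferred |= groups
    return inferred
```

With this table, an issue carrying only `category-a` would receive both `stage-x` and `group-1`, while an issue carrying `category-a` and `category-b` would receive only `stage-x` because the two categories disagree on the group.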
Using the data from existing issues, we aim to categorise new issues. We can break the initial work down into two phases:
- Automatically apply type label using Machine Learning
- Automatically apply category label using Machine Learning
- An accurate way to determine category labels would be very useful: once categories are applied, the existing inference can add the stage and group labels
- Detecting existing duplicate issues
Resources
- Existing external Python/Flask/TensorFlow solution for automatic label application
- GitLab issue for product feature
- Duplicate detection
Phase 1
The TensorFlow solution should be able to apply both type labels and category labels based on the issue title and description.
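The interface of the existing TensorFlow solution is not described here, so the following is only a sketch of the steps around the model, not the model itself. It assumes the classifier takes one text string per issue and returns a score between 0 and 1 for each candidate label; the function names and the 0.5 threshold are illustrative assumptions.

```python
from typing import Dict, List, Optional


def build_input_text(title: str, description: str) -> str:
    """Join title and description into the single string a text classifier would see."""
    return f"{title.strip()}\n\n{description.strip()}"


def pick_type_label(scores: Dict[str, float], min_score: float = 0.5) -> Optional[str]:
    """An issue has one type, so take the best-scoring label if it is confident enough."""
    if not scores:
        return None
    label, score = max(scores.items(), key=lambda item: item[1])
    return label if score >= min_score else None


def pick_category_labels(scores: Dict[str, float], min_score: float = 0.5) -> List[str]:
    """An issue may have several categories, so keep every label above the threshold."""
    return sorted(label for label, score in scores.items() if score >= min_score)
```

Treating the type label as a single choice and category labels as a thresholded set mirrors the triage requirements above, where a type is required but categories are optional and may be multiple.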
Trials
- Run the TensorFlow suggestions on a weekly basis to test the classifier
- Using triage-ops, we can collect the issues created in the previous week. The majority of these issues will have been manually triaged
- In the weekly job, we can compare the labels suggested by the TensorFlow solution with those added by the triage team
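The weekly comparison could be sketched as follows. This is a minimal illustration that assumes each collected issue has already been reduced to two sets of label names, `suggested` (from the classifier) and `applied` (from the triage team); the data shape and function names are hypothetical, and fetching the issues through triage-ops is not shown.

```python
from typing import Dict, List, Set


def compare_labels(suggested: Set[str], applied: Set[str]) -> Dict[str, Set[str]]:
    """Split the labels on one issue into agreed, missed and extra suggestions."""
    return {
        "agreed": suggested & applied,  # suggested and also applied by the team
        "missed": applied - suggested,  # applied by the team but not suggested
        "extra": suggested - applied,   # suggested but not applied by the team
    }


def weekly_summary(issues: List[Dict[str, Set[str]]]) -> Dict[str, float]:
    """Aggregate precision and recall of the suggestions over a week of issues."""
    agreed = missed = extra = 0
    for issue in issues:
        result = compare_labels(issue["suggested"], issue["applied"])
        agreed += len(result["agreed"])
        missed += len(result["missed"])
        extra += len(result["extra"])

    precision = agreed / (agreed + extra) if agreed + extra else 0.0
    recall = agreed / (agreed + missed) if agreed + missed else 0.0
    return {"precision": precision, "recall": recall}
```

Because not every issue will have been manually triaged, a real job would probably skip issues whose `applied` set is empty so that untriaged issues do not count against the classifier.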
Edited by Mek Stittri