hybran should parse RATT's correction report to catch more potential `pseudos`
Thanks to @aswing2364 for pointing out this example:
The only `pseudo`s we've been labeling out of RATT's results so far have been those with discontinuous intervals and those that end up fusing with their downstream neighbors.
In 1-0006, Rv0061c is annotated as 200bp long compared to the reference's 380bp.
RATT's 1-0006.1.Report.txt has:
Gene_ID error StartBad StopBad frameshifts splicesites length product errorStill StartStillBad StopStillBad frameshiftsStill JoinExons PossiblePseudo CorrectionLog
Rv0061c 2 0 1 1 0 0 Hypothetical protein" 0 0 0 0 0 0 // Corrected Stop
Hybran could pretty easily parse this file and examine the data here to make the call. For example, the fact that RATT found a frameshift in the gene and had to correct the stop position should result in the addition of the `pseudo` tag to the transferred annotation.
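
A rough sketch of what that parsing could look like, to make the idea concrete. This is not existing hybran code: the function name `find_pseudo_candidates` is hypothetical, the report is assumed to be tab-delimited with the header shown above, and the rule used (at least one frameshift plus a "Corrected Stop" entry in `CorrectionLog`) is just one possible reading of the criterion described here.

```python
import csv


def to_int(value):
    """Convert a report field to int, treating blank or malformed fields as 0."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def find_pseudo_candidates(report_lines):
    """Return Gene_IDs that RATT reported as frameshifted with a corrected stop.

    report_lines: any iterable of lines (e.g. an open file handle).
    ASSUMPTION: the report is tab-delimited and starts with the header row
    quoted in this issue.
    """
    # QUOTE_NONE keeps stray quote characters (as in the product column of
    # the Rv0061c row) from being interpreted as CSV quoting.
    reader = csv.DictReader(report_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    candidates = []
    for row in reader:
        has_frameshift = to_int(row.get("frameshifts")) > 0
        stop_corrected = "Corrected Stop" in (row.get("CorrectionLog") or "")
        # ASSUMPTION: this combination is the trigger for the pseudo tag;
        # the real rule may need more columns (e.g. PossiblePseudo).
        if has_frameshift and stop_corrected:
            candidates.append(row["Gene_ID"])
    return candidates


# Header and Rv0061c row taken from the report excerpt above.
HEADER = [
    "Gene_ID", "error", "StartBad", "StopBad", "frameshifts", "splicesites",
    "length", "product", "errorStill", "StartStillBad", "StopStillBad",
    "frameshiftsStill", "JoinExons", "PossiblePseudo", "CorrectionLog",
]
RV0061C = [
    "Rv0061c", "2", "0", "1", "1", "0", "0", 'Hypothetical protein"',
    "0", "0", "0", "0", "0", "0", "// Corrected Stop",
]

sample_report = ["\t".join(HEADER), "\t".join(RV0061C)]
print(find_pseudo_candidates(sample_report))
```

With a real report, the same function could be given the open file handle for 1-0006.1.Report.txt instead of the in-memory sample.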