Augustus retraining parameters generated from BUSCOs
BUSCO retrains Augustus on the BUSCO genes to obtain species-specific parameters. Following Waterhouse et al. (2017), I wanted to use those parameters to predict as many genes as accurately as possible (not only the BUSCO genes) in a species that is not among the Augustus default species. When I run Augustus on the genome assembly with --species set to the BUSCO-trained parameters from the retraining_parameters folder, the resulting .gff file contains only a few long protein sequences, which basically lack stop codon information. When I instead use one of the Augustus default species, I get many more, shorter protein sequences, and the .gff output does contain stop codon information.
Looking into etraining_err.log, I found these two errors (repeated about 9000 and 1600 times, respectively):
Error: In sequence NW_014575207.1_6084-7398: One CDS exon does not begin properly after the previous CDS exon.258 >= 224 GBProcessor::getGeneList(): Intron has non-positive length. Encountered error after reading 1948 annotations.
gene g7.t1 transcr. 1 in sequence NW_014575371.1_295077-297604: Single exon doesn't end in stop codon. Variable stopCodonExcludedFromCDS set right?
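For anyone who wants to reproduce the counts, here is a minimal sketch of how the two messages can be tallied. It assumes that each message sits on its own line of etraining_err.log and that matching on a substring of the text quoted above is sufficient. The labels and function names are my own, not part of Augustus or BUSCO.

```python
from collections import Counter

# Substrings copied from the two messages quoted above.
# The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = {
    "cds_exon_order": "One CDS exon does not begin properly after the previous CDS exon",
    "missing_stop": "doesn't end in stop codon",
}


def tally_errors(lines):
    """Count how many lines contain each of the known message substrings."""
    counts = Counter()
    for line in lines:
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts


def tally_file(path):
    """Tally the messages in a log file (assumed path: etraining_err.log)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return tally_errors(handle)
```

Calling tally_file("etraining_err.log") returns a Counter keyed by the two labels, which makes it easy to compare the numbers between a BUSCO3 and a BUSCO4 run.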
I tried setting the stopCodonExcludedFromCDS parameter to true by passing it to Augustus through BUSCO (--augustus_parameters='--stopCodonExcludedFromCDS=true'), but nothing changes; when I set it to false, BUSCO cannot complete the run.
I get the same .gff file from the Augustus run, lacking stop codon information, whether I use BUSCO3 or BUSCO4 to train Augustus.
How can I get accurate retraining parameters that include stop codon information? Thanks in advance.
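To make "lacking stop codon information" concrete, this is roughly how the output can be checked. It is only a sketch: it assumes the default GTF-like Augustus output, where the ninth column of CDS and stop_codon lines carries a transcript_id "..." attribute. With GFF3 output the attribute column looks different and the regular expression would need to change. The function name is my own.

```python
import re

# Matches the transcript_id attribute in the default (non-GFF3) Augustus output.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcripts_without_stop(lines):
    """Return (transcripts that have CDS but no stop_codon, number of transcripts with CDS)."""
    with_cds = set()
    with_stop = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment lines, including embedded protein sequences
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a feature line
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue  # gene/transcript lines carry no transcript_id attribute
        transcript = match.group(1)
        if fields[2] == "CDS":
            with_cds.add(transcript)
        elif fields[2] == "stop_codon":
            with_stop.add(transcript)
    return sorted(with_cds - with_stop), len(with_cds)
```

Running this over the .gff from the BUSCO-trained parameters and over the .gff from a default species gives a direct comparison of how many predicted transcripts end without a stop_codon feature in each case.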