... | ... | @@ -64,7 +64,29 @@ The last _ab initio_ gene prediction is done on the PN40024 assemblies with [Gen |
|
|
|
|
|
This step combines all the predictions generated by the different tools in the preceding steps.
|
|
|
|
|
|
### Filtering
|
|
|
### Filtering (AGAT)
|
|
|
|
|
|
The gene models from EvidenceModeler are filtered with the following rules:
|
|
|
|
|
|
Ab initio supported gene models were kept if:
|
|
|
* they were predicted by at least 2 ab initio predictors
|
|
|
* the start and stop codons were present
|
|
|
* the gene length was >= 300 bp
|
|
|
Ab initio supported gene models not matching these constraints were kept if:
|
|
|
* they had a database hit in the UniProt/Swiss-Prot or NCBI NR database (blastp hits with an e-value < 1e-6)
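
As an illustration of the e-value cutoff above, here is a minimal Python sketch that collects the query IDs with at least one hit below the threshold. It assumes the blastp results were written in the default 12-column tabular format (`-outfmt 6`), where the e-value is the 11th column; the actual output format used in the pipeline is not stated here. The function name `queries_with_hit` and the example records are hypothetical.

```python
def queries_with_hit(blast_lines, max_evalue=1e-6):
    """Return the set of query IDs having at least one hit with e-value < max_evalue.

    Assumes default BLAST tabular output (12 tab-separated columns,
    query ID in column 1 and e-value in column 11).
    """
    kept = set()
    for line in blast_lines:
        line = line.strip()
        # Skip blank lines and comment lines (present with -outfmt 7)
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue < max_evalue:
            kept.add(query_id)
    return kept


# Made-up records, for illustration only
example = [
    "geneA\thit1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300",
    "geneB\thit2\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25",
]
print(queries_with_hit(example))
```

In practice the lines would come from an open file handle instead of a list, for example `queries_with_hit(open(path))` with a path of your choosing.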
|
|
|
Gene models only supported by evidence data or by lifted annotation were kept if:
|
|
|
* the start and stop codons were present
|
|
|
* the gene length was >= 300 bp
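
The filtering rules above can be sketched as a single decision function. This is only an illustration of the logic in Python, not the AGAT commands used in the pipeline. The `GeneModel` class, its field names, and the example models are hypothetical, and the sketch assumes that "ab initio supported" means supported by at least one ab initio predictor.

```python
from dataclasses import dataclass

MIN_GENE_LENGTH = 300   # bp, from the rules above
MIN_AB_INITIO = 2       # minimum number of agreeing ab initio predictors


@dataclass
class GeneModel:
    gene_id: str
    length: int          # gene length in bp
    has_start: bool      # start codon present
    has_stop: bool       # stop codon present
    n_ab_initio: int     # number of ab initio predictors supporting the model
    has_db_hit: bool     # blastp hit with e-value < 1e-6


def is_complete(model):
    """Start and stop codons present and gene length >= 300 bp."""
    return model.has_start and model.has_stop and model.length >= MIN_GENE_LENGTH


def keep(model):
    """Apply the filtering rules to one gene model."""
    if model.n_ab_initio > 0:
        # Ab initio supported: keep if well supported and complete,
        # otherwise fall back on the database hit.
        if model.n_ab_initio >= MIN_AB_INITIO and is_complete(model):
            return True
        return model.has_db_hit
    # Only supported by evidence data or lifted annotation.
    return is_complete(model)


# Made-up models, for illustration only
models = [
    GeneModel("g1", 900, True, True, 3, False),   # complete, 3 predictors
    GeneModel("g2", 200, True, True, 1, True),    # rescued by database hit
    GeneModel("g3", 200, True, True, 1, False),   # too short, no hit
    GeneModel("g4", 450, True, True, 0, False),   # evidence only, complete
    GeneModel("g5", 450, True, False, 0, True),   # evidence only, no stop codon
]
kept = [m.gene_id for m in models if keep(m)]
print(kept)
```

Note that in this reading a database hit only rescues ab initio supported models; a model supported only by evidence data or lifted annotation must be complete to be kept.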
|
|
|
|
|
|
At the end, we have the final gene models (final GFF3).
|
|
|
|
|
|
### PASA
|
|
|
|
|
|
PASA is used to add UTR regions to the final GFF3.
|
|
|
|
|
|
### tRNAscan-SE
|
|
|
|
|
|
tRNAscan-SE is used to annotate tRNA genes.
|
|
|
|
|
|
## Data
|
|
|
|
... | ... | |
... | ... | |