quast.sh | assembly stats | Uses QUAST to assess quality of genome assemblies | Genome before filtering/softmasking & Genome after filtering/softmasking | Assembly statistics for both inputs |
#### Preparing TSA Files
TSA files from NCBI were prepared for use as evidence by the genome annotation tool in two steps: frame selection with [TransDecoder](https://github.com/TransDecoder/TransDecoder/wiki), followed by clustering with [USearch (v9.0)](https://www.drive5.com/usearch/manual9/).
frameSelect.sh | TSA Prep | Uses [TransDecoder](https://github.com/TransDecoder/TransDecoder/wiki) to identify coding regions in the transcript assemblies and translate into peptide sequences | TSA fasta file | BED, GFF3, CDS (nt coding sequence) & peptide files representing recovered coding regions |
usearch.sh | TSA Prep | Uses [USearch v9.0](https://www.drive5.com/usearch/manual9/) to cluster multiple frame-selected TSAs (**T**ranscriptome **S**hotgun **A**ssemblies) by sequence homology into a consensus transcriptome | A single fasta file of concatenated frame-selected TSAs | Clustered reference transcriptome |
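
The two TSA preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of frameSelect.sh or usearch.sh: the function names, the `run` dry-run helper, the choice of `-cluster_fast` with `-centroids`, and the 0.9 identity threshold are all assumptions made for the example. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (requires TransDecoder and USearch on PATH).
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Frame-select one TSA fasta file (hypothetical helper; $1 is the input fasta).
frame_select() {
    local tsa="$1"
    run TransDecoder.LongOrfs -t "$tsa"   # extract long open reading frames
    run TransDecoder.Predict -t "$tsa"    # predict likely coding regions
}

# Cluster the concatenated frame-selected sequences (hypothetical helper).
# $1 = concatenated fasta, $2 = output fasta, $3 = identity (assumed default 0.9).
cluster_tsa() {
    local combined="$1" out="$2" identity="${3:-0.9}"
    run usearch -cluster_fast "$combined" -id "$identity" -centroids "$out"
}
```

With the default dry-run setting, `frame_select tsa1.fasta` prints the two TransDecoder commands instead of running them. TransDecoder names its outputs after the input file with a `.transdecoder` infix (for example `tsa1.fasta.transdecoder.pep`), so the per-TSA results can be concatenated with `cat` before being passed to `cluster_tsa`.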
#### Evidence Alignment
Short-read and TSA evidence were aligned to the genome assemblies using [HISAT2](https://ccb.jhu.edu/software/hisat2/manual.shtml) and [GMAP](http://research-pub.gene.com/gmap/src/README), respectively. Before alignment, the short-read evidence was trimmed and quality-checked using Sickle and [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/).
fastqc.sh | short-read QC | Uses FastQC to assess read quality | fastq files from short-read libraries | statistics on read quality in HTML |
sickle.sh | short-read QC | Uses Sickle to trim low-quality bases from reads and discard reads left too short after trimming | raw fastq files for short-read libraries | trimmed fastq files |
hisatBuild.sh | short-read align | Builds indices to be used by HISAT2 | Length filtered and softmasked genome in fasta format | Set of index files |
hisat.sh | short-read align | Runs the HISAT2 short-read aligner | Path to the directory containing the index built using hisatBuild.sh & path to the trimmed read data | Read alignments in SAM format |
convert.sh | short-read align | Uses [samtools](http://samtools.sourceforge.net/) to convert SAM files to the binary BAM format | SAM output from running hisat.sh | BAM files of short-read alignments |
sort.sh | short-read align | Uses samtools to sort BAM files, a prerequisite for merging | Unsorted BAM files | Sorted BAM files |
merge.sh | short-read align | Uses samtools to merge the sorted alignments from each short-read library into a single BAM file | Sorted BAM files from each alignment | A single, merged BAM file |
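
The short-read steps in the table above chain together roughly as shown below. This is a hedged sketch rather than the actual scripts: the function names, file-naming scheme, and `run` dry-run helper are hypothetical, paired-end libraries and `-t sanger` quality encoding are assumed, and the samtools calls use the `-o` syntax of samtools 1.x. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (requires the tools on PATH).
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Write FastQC reports for the given fastq files into directory $1.
qc_reads() {
    local outdir="$1"; shift
    run fastqc -o "$outdir" "$@"
}

# Quality-trim one paired-end library; $3 is a hypothetical output prefix.
trim_pair() {
    local r1="$1" r2="$2" prefix="$3"
    run sickle pe -t sanger -f "$r1" -r "$r2" \
        -o "${prefix}_1.fastq" -p "${prefix}_2.fastq" -s "${prefix}_singles.fastq"
}

# Build a HISAT2 index ($2 = index base name) from the genome fasta ($1).
build_index() {
    run hisat2-build "$1" "$2"
}

# Align one trimmed paired-end library and write SAM output.
align_pair() {
    local index="$1" r1="$2" r2="$3" sam="$4"
    run hisat2 -x "$index" -1 "$r1" -2 "$r2" -S "$sam"
}

# Convert a SAM file to BAM, then coordinate-sort it.
sam_to_sorted_bam() {
    local sam="$1" base="${1%.sam}"
    run samtools view -b -o "${base}.bam" "$sam"
    run samtools sort -o "${base}.sorted.bam" "${base}.bam"
}

# Merge sorted BAM files ($2...) into a single BAM file ($1).
merge_bams() {
    local merged="$1"; shift
    run samtools merge "$merged" "$@"
}
```

Calling the functions in table order (`qc_reads`, `trim_pair`, `build_index`, `align_pair`, `sam_to_sorted_bam` for each library, then `merge_bams`) prints the full command sequence for review before anything is executed.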