Commit b022dc9b authored by Shubham Mukherjee


parent 3f707d1a
# Hadoop Deployment for Gesall Alignment Jar
The first step in the Hadoop deployment pipeline is to run the alignment jar, which takes FASTQ files as input and outputs multiple BAM files.
The `clean` and `markduplicate` jars then operate on these generated BAM files.
## Pre-processing
Before running the jar, we need to distribute the raw FASTQ files across all machines, as the process is highly memory intensive.
## Hadoop Configuration
The current Hadoop configuration for running the alignment jar can be found in `/share/apps/hadoop/etc/hadoop_align_latest`. To set up your own configuration:
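The exact steps are not reproduced here. As a minimal sketch, one common approach is to copy the shared configuration directory to a private location and point `HADOOP_CONF_DIR` at the copy. The helper name `copy_hadoop_conf` and the destination directory are hypothetical and not part of this repository; only the source path above comes from this document.

```shell
# Hypothetical helper, shown for illustration only.
# Usage: copy_hadoop_conf <source_conf_dir> <dest_conf_dir>
copy_hadoop_conf() {
    src="$1"
    dest="$2"
    if [ ! -d "$src" ]; then
        echo "source config directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # "src/." copies the directory contents, including dotfiles.
    cp -R "$src"/. "$dest"/ || return 1
    # Hadoop's launcher scripts read their configuration from HADOOP_CONF_DIR.
    export HADOOP_CONF_DIR="$dest"
}
```

For example, `copy_hadoop_conf /share/apps/hadoop/etc/hadoop_align_latest "$HOME/hadoop_align_conf"` (the destination is an assumed path) would leave the shared directory untouched, so edits to the copy affect only your own runs.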