Commit 5d4b07bd authored by Neville Sanjana

Update Readme.md

parent b44a76ae
@@ -27,10 +27,8 @@ cat Install.txt
## Example: Cas13 guide RNAs to target the SARS-CoV-2 RNA genome
-In the follwoing section, I demonstrate how to predict guide RNA scores for custom target RNAs.
-As an example I choose to score guide RNAs to target the [Coronavirus SARS-CoV-2 strain USA/NY1-PV08001/2020](https://nextstrain.org/ncov?c=location&f_division=New%20York&r=country).
-This strain represents a close relative to the strain responsible for the recent [coronavirus pandemic](https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020), bearing 3 nucleotide substitutions (G3243A, C25214T, G29027T) and two amino acid mutations (N: A252S, ORF1a: G993S).
-The positive-sense RNA virus contains 10 genes (ORF1ab, S, ORF3a, E, M, ORF6, ORF7a, ORF8, N, ORF10).
+In the following section, I demonstrate how to predict guide RNA scores for custom target RNAs.
+As an example, I choose to score guide RNAs to target the [Coronavirus SARS-CoV-2 strain USA/NY1-PV08001/2020](https://nextstrain.org/ncov?c=location&f_division=New%20York&r=country). The positive-sense RNA virus contains 10 genes (ORF1ab, S, ORF3a, E, M, ORF6, ORF7a, ORF8, N, ORF10). This strain was isolated from a patient in New York treated for COVID-19 after returning from Iran, which is one of several countries struggling with the [recent coronavirus pandemic](https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020). Upon sequencing (PacBio, 20,000x coverage), the strain was found to have 3 nucleotide substitutions (G3243A, C25214T, G29027T) resulting in two amino acid mutations (N: A252S, ORF1a: G993S), as compared to the original Wuhan isolate of SARS-CoV-2. The GISAID accession ID for this SARS-CoV-2 isolate is EPI_ISL_414476 and we are grateful to Dr. Harm van Bakel at Mt. Sinai for making this data publicly available.
To predict guide RNA scores for genes in the SARS-CoV-2 RNA genome, first change directories into the Cas13design folder.
The data directory contains the SARS-CoV-2 RNA genome sequences separated in single entry FASTA files.
@@ -53,11 +51,11 @@ ls -ltr ./data/*fasta
Next, run the RfxCas13d_GuideScoring.R script by providing 3 required input arguments:
1. the target sequence as a single entry FASTA file
2. the model input data
-3. a boolean variable (true/false), if you would like the predctions to be plotted relative to the input sequence.
+3. a boolean variable (true/false), if you would like the guide RNA scores to be plotted relative to the input sequence.
```r
-# Predict guide RNA scores for the USA/NY1-PV08001/2020 S gene
+# Predict guide RNA scores for the USA/NY1-PV08001/2020 spike (S) gene
Rscript ./scripts/RfxCas13d_GuideScoring.R ./data/MN908947_NY1-PV08001.S.fasta ./data/Cas13designGuidePredictorInput.csv true
......