# REPO MIGRATED TO GITHUB: https://github.com/KHanghoj/DamMet.

This version is outdated and will be removed. 



[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/dammet/README.html)


# DamMet, a full probabilistic model for mapping ancient methylomes #

-------------------------------------------------------------------------------

DamMet is a probabilistic model for mapping ancient methylomes from the sequencing data of an ancient specimen.
The model is implemented as a two-step procedure. The first step recovers a maximum likelihood estimate (MLE) of the position-specific deamination rates for methylated and unmethylated cytosine residues. The second step uses these deamination rates to return an MLE of the methylation level in a user-defined genomic window. The two-step procedure as implemented in DamMet is fully 

## Installation ##
DamMet depends on [htslib](https://github.com/samtools/htslib.git) and [nlopt](https://nlopt.readthedocs.io/en/latest). Both are downloaded and installed together with DamMet.

``` bash
  git clone https://gitlab.com/KHanghoj/DamMet.git
  cd DamMet && make && cd ..
```
[zlib](https://zlib.net/) and [cmake](https://cmake.org/download) should be globally installed.

## How to run DamMet ##

DamMet takes three required arguments: a BAM file (-b), a reference genome (-r), and the chromosome of interest (-c). To demonstrate that DamMet works and produces the expected output, we provide a small running example.

### Running example ###

``` bash
  git clone https://gitlab.com/KHanghoj/DamMet-tutorial.git
  cd DamMet-tutorial
  git clone https://gitlab.com/KHanghoj/DamMet.git
  cd DamMet && make && cd ..
  bash run.sh
```

The output plot (*result.pdf*) of this running example should be identical to *result.expected.pdf*. The main script in this example (`run.sh`) can be used as a template for other analyses.

### Options, Input and Output formats ###

All available options, each followed by a description, can be displayed by running DamMet without any arguments (`./DamMet/DamMet`).
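
As a rough sketch, a call with only the three required arguments (-b, -r, -c) could be assembled as follows. The input file names and the chromosome name are hypothetical placeholders, and the binary path assumes DamMet was built inside a `DamMet` directory as in the installation step; check the option list printed by DamMet itself before relying on this.

```bash
# Hypothetical inputs; replace with your own data.
bam="sample.bam"
ref="reference.fa"
chrom="chr20"

# Path to the binary, assuming the build layout from the installation step.
dammet="./DamMet/DamMet"

# Assemble and show the command with the three required arguments.
cmd="$dammet -b $bam -r $ref -c $chrom"
echo "$cmd"

# Run it only if the binary has actually been built.
if [ -x "$dammet" ]; then
  "$dammet" -b "$bam" -r "$ref" -c "$chrom"
fi
```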

#### Special Input formats ####

1. *-R* allows the user to provide a list of read groups (ID) that should be considered individually for estimating deamination rates. Read groups not listed in the input file are ignored, and their sequencing data are not considered in any analysis. The input file should contain a single read group name (ID) per line. If the user does not provide a file to *-R*, all sequencing data will be merged into a single read group. The latter is the default in DamMet.
2. *-E* allows the user to provide genomic sites that should be excluded. The file should contain one site per line (e.g. chr20 100001). The genomic position should be 1-based.
3. *-e* allows the user to provide genomic regions that should be excluded. The file should take the form of a standard BED file.
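
The three input files described above can be sketched as below. The file names, read group IDs, and coordinates are made up for illustration. The site file follows the `chr20 100001` example, and the BED file assumes the usual tab-separated, 0-based, end-exclusive convention.

```bash
# Read groups to analyse individually (-R): one ID per line (hypothetical IDs).
printf '%s\n' 'libraryA' 'libraryB' > readgroups.txt

# Single sites to exclude (-E): chromosome and 1-based position, one site per line.
printf '%s\n' 'chr20 100001' 'chr20 100057' > exclude_sites.txt

# Regions to exclude (-e): standard BED, tab-separated, 0-based start, end-exclusive.
printf '%s\t%s\t%s\n' 'chr20' '200000' '201000' > exclude_regions.bed
```

Note that the same genomic position is written differently in the two exclusion files: the 1-based site `chr20 100001` corresponds to the BED interval `chr20 100000 100001`.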

#### Output format ####

Two types of files are produced by DamMet, namely a *CHR.READGROUP.deamrates* file and a *CHR.READGROUP.[BED].F* file.

##### *CHR.READGROUP.deamrates* #####
This file contains the MLE of the deamination rates. 
1. Methylation status (Methylated == 0; Unmethylated == 1)
2. DNA position along a DNA molecule (0-based)
3. Prime (5-prime == 0; 3-prime == 1)
4. Deamination rate
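
As an illustration of how these fields can be used, the sketch below extracts the 5-prime deamination rates for methylated cytosines with `awk`. It assumes the file is whitespace-delimited, has no header line, and stores the four fields in the order listed above; none of this is confirmed here, so inspect a real file first. The example file and its values are synthetic.

```bash
# Synthetic stand-in for a CHR.READGROUP.deamrates file (values are made up).
cat > example.deamrates <<'EOF'
0 0 0 0.25
0 1 0 0.10
1 0 0 0.02
0 0 1 0.20
EOF

# Keep methylated (field 1 == 0), 5-prime (field 3 == 0) rows;
# print the position along the molecule and the deamination rate.
meth_5p=$(awk '$1 == 0 && $3 == 0 { print $2, $4 }' example.deamrates)
echo "$meth_5p"
```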

##### *CHR.READGROUP.BED.F* or *CHR.READGROUP.F* #####
Each row corresponds to a BED region or a single genomic CpG. Each output column comes with a description.


## Simulating sequence data with methylation-specific deamination patterns using [gargammel](https://github.com/grenaud/gargammel) ##


Along with the publication of DamMet, we added a new feature to [gargammel](https://github.com/grenaud/gargammel) that enables the user to simulate ancient DNA sequences with methylation-specific deamination patterns. This feature makes it possible to answer questions such as: what accuracy can be expected with the current sequencing effort, and what is the optimal trade-off between genomic resolution (e.g. the size of the genomic window) and sequencing effort?

## Citation ##

https://academic.oup.com/gigascience/article/8/4/giz025/5475519

# Troubleshooting #
## htslib installation issues ##
One user resolved some issues installing htslib by installing the following packages:
`libbz2-1.0 libbz2-dev libbz2-ocaml libbz2-ocaml-dev liblzma-dev`.