Tags · Stephan Fuchs / prophane

v6.2.6

fad9ba4f · bump to v6.2.6 · Oct 14, 2021

Release: v6.2.6
- fix failure of analysis if no sample groups were specified

v6.2.5

5f1eb059 · bump to v6.2.5 · Oct 13, 2021

Release: v6.2.5

- fix parsing of Proteome Discoverer output that lacks a column named "Master" by calculating the mean value of the non-zero protein abundances for all protein groups
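
The fallback described above can be illustrated with a minimal sketch. The function name `mean_nonzero` and the choice to return `0.0` when a group has no non-zero values are assumptions made for illustration; they are not taken from the Prophane code base.

```python
def mean_nonzero(abundances):
    """Mean of the non-zero abundance values of one protein group.

    Assumption: a group without any non-zero value gets 0.0.
    """
    nonzero = [a for a in abundances if a != 0]
    if not nonzero:
        return 0.0
    return sum(nonzero) / len(nonzero)


# Hypothetical abundance values of the member proteins of one group
print(mean_nonzero([0, 2.0, 4.0]))
```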

v6.2.4

4921c494 · bump to v6.2.4 · Oct 01, 2021

Release: v6.2.4

- fix dbcan tasks: erroneous parsing of hmmer output resulted in many missed annotations. Previously the wrong columns (1/3) were parsed from hmmer results, which resulted in a lot of missed hits. The correct columns are 0 and 2 for hmmscan and hmmsearch respectively.
- internal: refactor building and writing of summary for comprehensibility.
- internal: remove second accession column from id2annot map files
- internal: rename column names of dbcan, pfam, tigrfam maps

v6.2.3

faf8b183 · bump to v6.2.3 · Sep 28, 2021

Release: v6.2.3

- fix tigrfam download links
- fix memory issues during lca calculation. Large numbers of protein groups in combination with many samples led to high RAM consumption. Protein group and sample information is now stored in a SQLite db (`pgs/protein_groups_db.sql`), reducing the memory requirements to a minimum.
- fix order of summary columns: When using sample groups, the name and title of summary columns did not match (affected: quant and mafft columns). Krona plots were not affected.
- fix Proteome Discoverer parsing for groups without master protein (#84). If a group has no master protein, its abundance is set to the mean of all member protein abundance values.
- fix emapper v4 analysis (set block size parameter only for emapper v5)

v6.2.2

de0934e8 · bump to v6.2.2 · Aug 20, 2021

Release: v6.2.2

- fix #82: prophane.de: eggnog jobs fail with memory error for some query fasta files
  - fixed by setting the block_size parameter based on the size of the all.faa fasta file: if the fasta file is larger than 10MB: block_size = 10 / size_in_MB(all.faa), rounded to one decimal. For smaller fasta files, the block_size is set to 2.
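
The rule above can be written down as a short sketch. The function name `block_size_for` is hypothetical, and the note does not say whether "MB" means 10^6 or 2^20 bytes; 10^6 is assumed here.

```python
MB = 1_000_000  # assumption: 1 MB = 10^6 bytes (not stated in the release note)


def block_size_for(size_in_bytes):
    """Return the block_size for a query fasta file of the given size."""
    size_in_mb = size_in_bytes / MB
    if size_in_mb > 10:
        # larger files get a proportionally smaller block size
        return round(10 / size_in_mb, 1)
    # files of up to 10 MB use the fixed value
    return 2


print(block_size_for(20 * MB))
```

In practice the size could be obtained with `os.path.getsize` on the `all.faa` file; how Prophane reads the size is not stated in the note.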

v6.2.1

f1a1100a · bump to v6.2.1 · Jul 28, 2021

Release: v6.2.1

- fix eggnog v5 download
- fix eggnog v5 result mapping
- fix emapper log redirection: the log is now properly written to {task_file_name}.log

v6.2

4889e765 · bump to v6.2 · Jul 09, 2021

Release: v6.2
- add support for eggnog database version 5.0.2

v6.1.1

775fa43a · bum to v6.1.1 · Jul 06, 2021

Release: v6.1.1

- fix issue 80: only download ncbi_taxdump database for taxonomic analyses
- fix mafft workflow if no protein groups with more than one accession are present
- fix prophane crash upon executing `prophane list-dbs` on outdated databases. DBs are now automatically migrated.
- add `prophane --version` parameter to cli
- doc: adapt installation instructions to include direct conda installation, remove setup.sh

v6.1

f02d4bc1 · bump to v6.1 · Jun 10, 2021

Release: v6.1

- add support for gzipped fasta files as input
- fix crash during parsing of Proteome Discoverer Output if it contains protein groups without any quantification values for the master protein. Now, Prophane ignores these protein groups.

v6.0.5

33b6edb1 · bump to v6.0.5 · Jun 04, 2021

Release: v6.0.5
- add support for large mzIdentML files

v6.0.4

6d735f2c · bump to v6.0.4 · Jun 03, 2021

Release: v6.0.4

- refactor code to parse search results using snakemake (previously: plain python)

v6.0.3

f1b7020c · bump to v6.0.3 · Jun 03, 2021

Release: v6.0.3

- fix command line interface option "prepare-dbs". It now accepts additional snakemake options and will work if the job config contains acc2annot_mapper tasks

v6.0.2

5c042660 · bump to v6.0.2 · May 28, 2021

Release: v6.0.2

- fix database migration for setups that contain multiple versions of the same db

v6.0.1

c1e596e4 · bump to v6.0.1 · May 20, 2021

Release: v6.0.1
- fix prophane path detection in CLI

v6.0

6dff995e · bump to v6.0 · May 19, 2021

Release: v6.0

** 6.0
*** Breaking Changes
change of command line interface:

    prophane -> prophane.py run
    prophane --list-dbs -> prophane.py list-dbs
    prophane --list-styles -> prophane.py list-styles

*** Features
Automatically download databases that are specified in the job-config. On first run of prophane, run `prophane init {DB_DIR}`, where DB_DIR is an empty or nonexistent directory. To execute prophane, run `prophane run {path-to-job-config}`.

*** Changes
* bump the db_schema version to 5
* task and plot files now include the database version number instead of the md5sum: tasks/{annot_type}_annot_by_{tool}_on_{db_type}.v{db_version}.task{taskid}.{ext}
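
As a small illustration of the naming scheme above, the pattern can be filled in with Python's `str.format`. The helper `task_file_name` and all example values (annotation type, tool, database, version, extension) are hypothetical and chosen only to show the resulting shape of the file name.

```python
TASK_FILE_PATTERN = (
    "tasks/{annot_type}_annot_by_{tool}_on_{db_type}"
    ".v{db_version}.task{taskid}.{ext}"
)


def task_file_name(annot_type, tool, db_type, db_version, taskid, ext):
    """Fill the task file pattern quoted in the release note."""
    return TASK_FILE_PATTERN.format(
        annot_type=annot_type,
        tool=tool,
        db_type=db_type,
        db_version=db_version,
        taskid=taskid,
        ext=ext,
    )


# Hypothetical values, for illustration only
print(task_file_name("fun", "emapper", "eggnog", "5.0.2", 0, "map"))
```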

v5.1.1

9f56d99b · bump to v5.1.1 · May 10, 2021

Release: v5.1.1
- add mamba dependency

v5.1

91d08b0c · bump to v5.1 · Mar 30, 2021

Release: v5.1
- add parser for proteome discoverer output

v5.0.2

1c2bafd5 · bump to v5.0.2 · Mar 22, 2021

Release: v5.0.2

- fix mztab parser skipping samples without associated spectra and not counting all spectra

v5.0.1

1d8a2198 · bump to v5.0.1 · Feb 19, 2021

Release: v5.0.1

fix parsing of spectra IDs from protein group yaml and some test files

v5.0.0

2b3b4b0b · bump version to v5.0.0 · Feb 17, 2021

Release: v5.0.0

LCA determination of proteins with multiple annotations: If a species/lineage is found for all protein accessions in a protein group, this species/lineage is chosen as LCA. Before: if any other species was determined --> various.

LCA determination with two different methods:
1. (default) per protein group (with threshold, default: 1). If the ratio of proteins assigned to an ancestor is at least as high as the threshold, this ancestor is assigned to the entire group (if multiple ancestors are above the threshold, the LCA with the highest support is chosen). This changes the previous default behaviour. For example, if a species/lineage is found for all protein accessions in one group, this species/lineage has a support of 100% and is chosen as LCA by the new method. Before: if any other species was determined --> various.
2. democratic: takes the one with highest occurrence among all protein groups in the whole task
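
The per-group threshold method (method 1) can be sketched roughly as follows. This is a simplified illustration, not Prophane's implementation: it works on flat taxon names at a single rank instead of full lineages, the function name `lca_by_threshold` is hypothetical, and ties between equally supported taxa are broken alphabetically as an arbitrary assumption.

```python
from collections import Counter


def lca_by_threshold(annotations, threshold=1.0, fallback="various"):
    """Pick one taxon for a protein group.

    annotations: one set of taxon names per protein accession in the group.
    Returns (taxon, support); support is the fraction of proteins in the
    group that carry the chosen taxon.
    """
    n_proteins = len(annotations)
    if n_proteins == 0:
        return fallback, 0.0
    # count in how many proteins each taxon occurs
    counts = Counter(taxon for taxa in annotations for taxon in set(taxa))
    candidates = [
        (count / n_proteins, taxon)
        for taxon, count in counts.items()
        if count / n_proteins >= threshold
    ]
    if not candidates:
        return fallback, 0.0
    # highest support wins; ties are broken alphabetically (assumption)
    support, taxon = min(candidates, key=lambda c: (-c[0], c[1]))
    return taxon, support


# Hypothetical group: one species is found for every accession
group = [{"E. coli", "S. enterica"}, {"E. coli"}]
print(lca_by_threshold(group))
```

With the default threshold of 1, the species shared by all accessions is chosen even though one accession carries a second annotation, which matches the behaviour change described above.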

New columns in summary.txt: lca-support, describing the number of proteins/spectra assigned to the respective LCA. The summary rule is also adjusted for csv export: only the separator has to be changed, and the output includes spectra (if available).