Multifetch should add the appropriate ECO code
Context
The multifetch.tsv file lists the sources in which each piece of information can be found, with the left-most available source taking priority. The Multifetch tool takes this file as its configuration, checks the sources in priority order, and writes the first value it finds to the output file.
What we need now is to add an evidence annotation in addition to the data itself using ECO:0000311.
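The priority lookup described above can be sketched roughly as follows. This is only an illustration, not the actual Multifetch code: the function name `first_available` and the idea of passing sources as ordered `(name, value)` pairs are hypothetical, and treating an empty or whitespace-only string as "not found" is an assumption.

```python
from typing import Optional, Sequence, Tuple


def first_available(
    sources: Sequence[Tuple[str, Optional[str]]],
) -> Optional[Tuple[str, str]]:
    """Return (source_name, value) for the left-most source that has a value.

    `sources` is ordered by priority, left-most first, mirroring the column
    order in multifetch.tsv. Returns None when no source has the value.
    """
    for name, value in sources:
        # Assumption: None or an empty/whitespace-only string means "missing".
        if value is not None and value.strip() != "":
            return name, value
    return None


# BioProject has no value, so the next source in priority order (EMBL) wins.
hit = first_available([("BioProject", None), ("EMBL", "xyz"), ("SRA", "abc")])
```

Returning the source name together with the value matters here, because the evidence annotation needs to know which source the value came from.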
Details
For a given attribute, instead of writing only the value from the highest-priority source to the tsv file, we also need to append an evidence annotation saying which source was used. The suffix to add for each source is given in the following table:
Source | Suffix |
---|---|
BioProject | <ECO:0000311>{acc:bioproject@ciri} |
BioSample | <ECO:0000311>{acc:biosample@ciri} |
EMBL | <ECO:0000311>{acc:genbank@ciri} |
SRA | <ECO:0000311>{acc:sra@ciri} |
release_genome_meta.tsv | <ECO:0000311>{acc:gisaid@ciri} |
For instance, if a piece of data is found in EMBL and the data is `xyz`, then instead of just writing `xyz` to the tsv, we should write: `xyz<ECO:0000311>{acc:genbank@ciri}`.
In addition, extra fields (the ones added to the right of the tsv) should also contain the evidence suffix.
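A minimal sketch of the suffix logic, assuming the source names are available as the exact strings used in the table above. The names `SOURCE_ACC`, `evidence_suffix`, and `annotate` are hypothetical and not part of the existing tool.

```python
ECO_CODE = "ECO:0000311"

# Source name -> accession label, copied from the suffix table.
SOURCE_ACC = {
    "BioProject": "bioproject",
    "BioSample": "biosample",
    "EMBL": "genbank",
    "SRA": "sra",
    "release_genome_meta.tsv": "gisaid",
}


def evidence_suffix(source: str) -> str:
    """Build the evidence suffix for a source, e.g. <ECO:0000311>{acc:sra@ciri}."""
    # Doubled braces in an f-string produce literal '{' and '}'.
    return f"<{ECO_CODE}>{{acc:{SOURCE_ACC[source]}@ciri}}"


def annotate(value: str, source: str) -> str:
    """Append the evidence suffix for `source` to `value`."""
    return value + evidence_suffix(source)


annotated = annotate("xyz", "EMBL")
```

With this sketch, `annotate("xyz", "EMBL")` yields `xyz<ECO:0000311>{acc:genbank@ciri}`, matching the example above. The same helper could be applied to the extra fields, given the source each one was read from. An unknown source raises `KeyError`, which is a design choice of the sketch rather than a stated requirement.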