Low BUSCO score for almost complete bacterial genome
Dear Mathieu,
I am running BUSCO 4.1.4 on a bacterial genome assembly (Klebsiella pneumoniae; genome size 5.8 Mb plus plasmids). The contiguity stats suggest the assembly is nearly complete (max contig length: 5.4 Mb, total contigs: 5, N50: 5416111).
I tried BUSCO with gammaproteobacteria_odb10, but only 12.8% complete BUSCOs were found. I understand that misassemblies and similar problems could cause this, but when I BLAST the contigs against the Klebsiella genome I get full-length hits to its genes. I also tried a couple of other lineages, enterobacterales_odb10 and bacteria_odb10, but the results don't change much. Here is my command for reference:
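In case it matters how the N50 was obtained, here is a minimal sketch of the calculation as I understand it. Only the largest contig length (5416111) comes from my assembly stats; the other four lengths are made-up placeholders for illustration, and `n50` is just a helper name I chose.

```python
def n50(lengths):
    """Return the length of the contig at which the running total,
    taken from longest to shortest, first reaches half the assembly size."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs given")


# Largest value is from the post; the remaining four are PLACEHOLDERS,
# not the real contig lengths of this assembly.
contig_lengths = [5416111, 180000, 120000, 60000, 20000]

print("Total length:", sum(contig_lengths))
print("N50:", n50(contig_lengths))
```

With one contig holding most of the assembly, the N50 is simply the length of that contig, which matches the value reported above.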
busco -i wtdbg2.ctg.fa -o busco-wt -m genome -l ./gammaproteobacteria_odb10 -f --config ./config.ini
The logs are given below:
INFO: ***** Start a BUSCO v4.1.4 analysis, current time: 10/15/2020 09:44:58 *****
INFO: Configuring BUSCO with k_pneumonia/config.ini
INFO: Mode is genome
INFO: 'Force' option selected; overwriting previous results directory
INFO: Input file is wtdbg2.ctg.fa
INFO: Downloading information on latest versions of BUSCO data...
INFO: Using local lineages directory gammaproteobacteria_odb10
INFO: Running BUSCO using lineage dataset gammaproteobacteria_odb10 (prokaryota, 2020-03-06)
INFO: ***** Run Prodigal on input to predict and extract genes *****
INFO: Running Prodigal with genetic code 11 in single mode
INFO: Running 1 job(s) on prodigal, starting at 10/15/2020 09:45:02
INFO: [prodigal] 1 of 1 task(s) completed
INFO: Genetic code 11 selected as optimal
INFO: ***** Run HMMER on gene sequences *****
INFO: Running 366 job(s) on hmmsearch, starting at 10/15/2020 09:45:17
INFO: [hmmsearch] 37 of 366 task(s) completed
INFO: [hmmsearch] 74 of 366 task(s) completed
INFO: [hmmsearch] 110 of 366 task(s) completed
INFO: [hmmsearch] 147 of 366 task(s) completed
INFO: [hmmsearch] 184 of 366 task(s) completed
INFO: [hmmsearch] 220 of 366 task(s) completed
INFO: [hmmsearch] 257 of 366 task(s) completed
INFO: [hmmsearch] 293 of 366 task(s) completed
INFO: [hmmsearch] 330 of 366 task(s) completed
INFO: [hmmsearch] 366 of 366 task(s) completed
INFO: Results: C:12.8%[S:12.8%,D:0.0%],F:51.1%,M:36.1%,n:366
--------------------------------------------------
|Results from dataset gammaproteobacteria_odb10 |
--------------------------------------------------
|C:12.8%[S:12.8%,D:0.0%],F:51.1%,M:36.1%,n:366 |
|47 Complete BUSCOs (C) |
|47 Complete and single-copy BUSCOs (S) |
|0 Complete and duplicated BUSCOs (D) |
|187 Fragmented BUSCOs (F) |
|132 Missing BUSCOs (M) |
|366 Total BUSCO groups searched |
--------------------------------------------------
INFO: BUSCO analysis done. Total running time: 58 seconds
INFO: Results written in k_pneumonia/busco-wt
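As a sanity check on the numbers above, here is a small sketch that parses the one-line summary and confirms the percentages agree with the counts in the table. The summary string and counts are copied from the log; `parse_summary` is a helper name I made up, not part of BUSCO.

```python
import re

# One-line summary and counts copied from the BUSCO log above.
summary = "C:12.8%[S:12.8%,D:0.0%],F:51.1%,M:36.1%,n:366"
counts = {"C": 47, "S": 47, "D": 0, "F": 187, "M": 132}


def parse_summary(line):
    """Extract the percentage per category and the total number of BUSCOs."""
    pct = {key: float(val) for key, val in re.findall(r"([CSDFM]):([\d.]+)%", line)}
    total = int(re.search(r"n:(\d+)", line).group(1))
    return pct, total


pct, n = parse_summary(summary)

# Complete + fragmented + missing should account for every BUSCO group.
print("C + F + M =", counts["C"] + counts["F"] + counts["M"], "of", n)

for key, count in counts.items():
    recomputed = round(100 * count / n, 1)
    print(f"{key}: {count}/{n} = {recomputed}% (reported {pct[key]}%)")
```

The counts and percentages are consistent with each other, so the low score is not a reporting problem: most of the BUSCOs are being found as fragmented (187) or not at all (132).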
Please could you advise?
Many thanks in advance. Urmi