Error in Assembled Mitochondrial Genome Output for Some Metagenomes #226

jpinus · 2025-01-10T10:59:29Z

on which platform/server? (Windows? Windows Sublinux? MacOS? Ubuntu? etc.)

linux HPC

MitoZ version?

mitoz 3.6

How did you install MitoZ? (e.g. Docker, Udocker, Singularity, Conda-Pack, Conda, or source code)

conda

Did you run a test after your installation, and was the test run okay?

sure

How much data (roughly) did you use for mitogenome assembly? e.g. 5Gbp?

assembled mitochondrial genomes via NOVOPlasty

The command you used?

mitoz annotate --outprefix ${sample} --fastafiles simplified_${sample}.fasta --thread_number 20 --clade Annelida-segmented-worms

Problem description

I'm currently using MitoZ to assemble mitochondrial genomes from a set of 155 metagenomes. While the assembly worked perfectly for 140 of the metagenomes, I’m encountering an issue with the remaining samples. Even for those where the mitochondrial genomes were circularized, the assembly is incomplete or contains errors, such as misaligned or missing genes.

I've checked my input data for quality, and there don't seem to be any issues with it. I'd appreciate any guidance on how to resolve this.

Log messages from MitoZ (stdout and stderr, e.g., both `m.log` and `m.err` files)

2025-01-10 10:32:22,691 - mitoz.utility.utility - INFO -
combine_annotations_and_find_control_region() chdir to /opt/extern/bremen/symbiosis/jkiefer/P6960/04_mtDNA/anno/6960_AU/tmp_6960_AU_simplified_6960_AU.fasta_mitoscaf.fa
Traceback (most recent call last):
  File "/opt/share/software/packages/mitoz-3.6/conda-env/bin/mitoz", line 10, in <module>
    sys.exit(main())
  File "/opt/share/software/packages/mitoz-3.6/conda-env/lib/python3.8/site-packages/mitoz/MitoZ.py", line 99, in main
    args.func(args)
  File "/opt/share/software/packages/mitoz-3.6/conda-env/lib/python3.8/site-packages/mitoz/annotate/annotation.py", line 680, in main
    tbl_file, errorsummary_val_file, tbl2asn_gbf, summary_file = combine_annotations_and_find_control_region(
  File "/opt/share/software/packages/mitoz-3.6/conda-env/lib/python3.8/site-packages/mitoz/annotate/annotation.py", line 412, in combine_annotations_and_find_control_region
    if file_not_empty(mt_file_cdsft):
  File "/opt/share/software/packages/mitoz-3.6/conda-env/lib/python3.8/site-packages/mitoz/utility/utility.py", line 55, in file_not_empty
    if os.stat(file).st_size > 0:
FileNotFoundError: [Errno 2] No such file or directory: '6960_AU_simplified_6960_AU.fasta_mitoscaf.fa.cds.ft'
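
Since the crash comes down to a missing `*.cds.ft` file in the sample's `tmp_*_mitoscaf.fa` directory, one way to see which of the 155 runs are affected is to scan the annotation folder for tmp directories without that file. This is only a rough sketch: the function name `find_missing_cds_ft` is made up for illustration, and the layout `<anno_root>/<sample>/tmp_*_mitoscaf.fa/` is an assumption read off the path in the log above, so adjust the glob pattern if your directories differ.

```python
from pathlib import Path


def find_missing_cds_ft(anno_root):
    """Return tmp_*_mitoscaf.fa directories that hold no non-empty *.cds.ft file.

    Assumes (from the log path in this thread) one sub-directory per sample,
    each containing a tmp_*_mitoscaf.fa directory written by `mitoz annotate`.
    """
    missing = []
    for tmp_dir in sorted(Path(anno_root).glob("*/tmp_*_mitoscaf.fa")):
        if not tmp_dir.is_dir():
            continue
        # The traceback shows MitoZ stat-ing <prefix>_mitoscaf.fa.cds.ft,
        # so look for any non-empty file with that suffix.
        has_cds_ft = any(p.stat().st_size > 0 for p in tmp_dir.glob("*.cds.ft"))
        if not has_cds_ft:
            missing.append(tmp_dir)
    return missing
```

Calling `find_missing_cds_ft("anno")` from the directory above the per-sample folders would return the tmp directories of the failing samples, which can then be compared against the samples that annotated cleanly.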


linzhi2013 · 2025-01-12T11:49:42Z

Hi, you are annotating mitochondrial genomes with MitoZ, not assembling them.

I cannot tell what went wrong from the log you provided. But if some PCGs are unannotated, you can extend the database (https://github.com/linzhi2013/MitoZ/wiki/Extending-MitoZ-s-database).

Best

jpinus · 2025-01-12T16:51:54Z

@linzhi2013 yes, because the coverage is not high enough for assembly with MitoZ, which always failed. That's why I do the assembly with NOVOPlasty and annotate the assembled mitochondrial genomes with MitoZ. This worked with exactly the same script for >140 mitochondrial genomes but unfortunately failed in 6 cases. In each of these six cases, the error message is that this one specific file was not found.
I also used MITOS2 for annotation and it worked, so the genes are all there and the genomes are complete.

linzhi2013 · 2025-01-12T22:47:53Z

The *_mitoscaf.fa.cds.ft file was not generated, so no PCGs were annotated. Two things to check: (1) the sequence ID must not be too long; (2) these species may be too divergent from the PCG database.
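
To test point (1), the FASTA headers can be shortened before rerunning `mitoz annotate`. The sketch below is only illustrative: `shorten_fasta_ids` is a hypothetical helper, and the `max_len=20` threshold is an assumption, because no exact length limit is given in this thread.

```python
def shorten_fasta_ids(fasta_text, max_len=20, prefix="seq"):
    """Replace over-long FASTA sequence IDs with short ones.

    Returns (new_fasta_text, mapping), where mapping maps each new ID back to
    the original ID. max_len is an assumed threshold, not a documented limit.
    """
    mapping = {}
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token after '>'.
            fields = line[1:].split(None, 1)
            old_id = fields[0] if fields else ""
            if len(old_id) > max_len:
                new_id = f"{prefix}{len(mapping) + 1}"
                mapping[new_id] = old_id
                rest = f" {fields[1]}" if len(fields) > 1 else ""
                line = f">{new_id}{rest}"
        out.append(line)
    return "\n".join(out) + "\n", mapping
```

Keeping the returned mapping makes it possible to restore the original names in the annotation output afterwards.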

jpinus · 2025-01-13T09:45:35Z

@linzhi2013 but *_mitoscaf.fa.cds.position and *_mitoscaf.fa.cds.position.sorted were generated:

cat mtDNA_mitoscaf.fa.cds.position.sorted
mtDNA COX3 259 1 259 1124 1900 +
mtDNA ATP6 228 1 226 2407 3075 +
mtDNA ND3 117 5 117 3093 3419 +
mtDNA ND6 155 1 154 3422 3877 +
mtDNA ND5 569 12 453 3950 5638 +
mtDNA ND4L 98 23 88 5816 6112 +
mtDNA ND4 447 104 412 6112 7449 +
mtDNA ND2 335 5 334 8654 9625 +
mtDNA ND1 308 22 302 9971 10882 +
mtDNA COX2 228 1 228 11864 12547 +
mtDNA COX1 510 1 505 13955 15484 +
mtDNA ATP8 49 1 44 15593 15748 +
mtDNA CYTB 379 3 364 15971 17059 +

So MitoZ is annotating: these are the genes I want and need for mitoz-tools group_seq_by_gene.
Of course I could extract the genes manually, but I don't understand why the pipeline is crashing.

Here are all the files located in the tmp_*_mitoscaf.fa directory:
*_mitoscaf.fa
*_mitoscaf.fa.cds.position
*_mitoscaf.fa.cds.position.sorted
*_mitoscaf.fa.l-rRNA.ft
*_mitoscaf.fa.l-rRNA.out
*_mitoscaf.fa.l-rRNA.tbl
*_mitoscaf.fa.most_related_species.txt
*_mitoscaf.fa.njs
*_mitoscaf.fa.s-rRNA.ft
*_mitoscaf.fa.s-rRNA.out
*_mitoscaf.fa.s-rRNA.tbl
*_mitoscaf.fa.solar.genewise.gff.cds.position.cds
*_mitoscaf.fa.solar.genewise.gff.cds.position.cds.taxa
*_mitoscaf.fa.solar.genewise.gff.pep
*_mitoscaf.fa.trna
*_mitoscaf.fa.trna.ft
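
As a stopgap for the manual extraction mentioned above, the `.cds.position.sorted` file could be parsed directly. The following is a minimal sketch under stated assumptions: the column meanings are inferred from the numbers shown in this thread (for COX3, 1124 to 1900 is 777 nt, i.e. 259 codons, matching column 3), not taken from MitoZ documentation, and the coordinates are assumed to be 1-based and inclusive. Both function names are hypothetical.

```python
def parse_cds_positions(text):
    """Parse lines like 'mtDNA COX3 259 1 259 1124 1900 +'.

    Assumed columns: seq ID, gene, protein length, aligned protein start/end,
    nucleotide start, nucleotide end, strand.
    """
    genes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        genes.append({
            "seq_id": fields[0],
            "gene": fields[1],
            "start": int(fields[5]),
            "end": int(fields[6]),
            "strand": fields[7],
        })
    return genes


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_gene(sequence, start, end, strand):
    """Slice a gene out of the genome (1-based, inclusive coordinates assumed)."""
    sub = sequence[start - 1:end]
    if strand == "-":
        # Reverse-complement genes annotated on the minus strand.
        sub = sub.translate(_COMPLEMENT)[::-1]
    return sub
```

The extracted coordinates are raw homology hits, so start and stop codons should be checked by eye before the sequences are used downstream.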
