Hi all,
I am working on assembling the mitochondrial genome of my insect parasitoid, Microctonus aethiopoides. A reference mitochondrial genome of Microctonus brassicae is available in NCBI (accession number: OU953852), with a size of 40 kbp. Using this as a reference, I assembled the Nanopore long-read data with Flye in meta mode and obtained a 31 kbp contig with coverage >1000x (as in the image), and the assembly is circular according to the Flye info file. (Note: I used this method because none of the mitochondrial genome assembly tools I tried (NOVOPlasty, MitoFinder, MitoZ, mitoflow) could produce a circular contig.)
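In case it helps anyone reproduce the circularity check, here is a minimal sketch of how the Flye info file can be filtered for circular, high-coverage contigs. It assumes the usual tab-separated assembly_info.txt layout (seq_name, length, cov., circ., ...) with Y/N in the circ. column; the function name, the coverage threshold, and the example rows are made up for illustration and are not my real output.

```python
import csv
import io


def circular_contigs(info_text, min_cov=0.0):
    """Return (name, length, coverage) for contigs marked circular.

    Assumes tab-separated columns in the order:
    seq_name, length, cov., circ., ... (the rest is ignored).
    """
    hits = []
    reader = csv.reader(io.StringIO(info_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and the header line starting with '#'
        if not row or row[0].startswith("#"):
            continue
        name, length, cov, circ = row[0], int(row[1]), float(row[2]), row[3]
        if circ == "Y" and cov >= min_cov:
            hits.append((name, length, cov))
    return hits


# Hypothetical example rows (not real data)
example = (
    "#seq_name\tlength\tcov.\tcirc.\trepeat\tmult.\talt_group\tgraph_path\n"
    "contig_1\t31000\t1200\tY\tN\t1\t*\t1\n"
    "contig_2\t52000\t15\tN\tN\t1\t*\t2\n"
)
print(circular_contigs(example, min_cov=100))
```

For a real run, the text of the info file would be read from disk and passed in place of the example string.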
For additional confirmation, I blasted this contig against the nucleotide database (ntdb), and the top hits were mitochondrial genomes, which supports that it is indeed the mitochondrial genome.
Here are my questions:
Is it common for insect mitochondrial genomes to be as large as 30-40 kbp? I was initially surprised to see the 40 kbp reference in NCBI, as I understood that mitochondrial genomes typically range from 15-25 kbp.
Could the large size be due to many repeat regions? When visualizing the mapped regions in IGV, I noticed many reads with low mapping quality, which might indicate repeat regions. How should I deal with these repeats?
Annotation issues: When I annotated the assembled genome, the annotation reported the mitochondrial genes but also many OH (origin of replication) regions. Is this normal, and how should I address this in my analysis?
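For the repeat question, one rough check I am considering is the fraction of k-mers in the contig that occur more than once. This is only a sketch under simple assumptions: the function name and k=21 are arbitrary choices of mine, it looks at the forward strand only, and it does not wrap around the circular junction, so it would be a crude indicator rather than a proper repeat annotation.

```python
from collections import Counter


def repeated_kmer_fraction(seq, k=21):
    """Fraction of k-mer positions whose k-mer occurs more than once.

    Forward strand only; the sequence is treated as linear.
    """
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    repeated = sum(1 for km in kmers if counts[km] > 1)
    return repeated / len(kmers)


# Toy sequences for illustration only
print(repeated_kmer_fraction("ACGT" * 10, k=4))  # tandem repeat
print(repeated_kmer_fraction("ACGTTGCA", k=4))   # no repeated 4-mers
```

A high value on the real contig would be consistent with the low mapping quality reads I see in IGV, but it would not say where the repeats are or whether they are assembled correctly.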
Any suggestions or insights would be greatly appreciated!