Dear assemblers,
I'm trying to assemble the complete mtDNA genome of a Drosophila species using paired-end transcriptome reads. To do so I'm using MIRA 4 with MITObim 1.7, following the protocol described here (https://github.com/chrishah/MITObim).
As a first step, I aligned all reads with bowtie to all available complete mtDNAs from Drosophila species and took the one with the most mapped reads as the backbone for MIRA, on the assumption that it is the most closely related and shares the largest fraction of conserved regions.
I start from 140M paired-end reads (140M + 140M), which is too many: MIRA gives me a clear warning about excessive coverage. So I'm splitting the reads into blocks to get a more suitable coverage. I have a few questions; maybe you could help me.
- What coverage should I aim for?
- Does it make sense to perform x reconstructions, each with a different subset of reads, and take the consensus of the aligned de novo genomes?
- In a preliminary analysis I used 50% of the reads. MITObim starts the iterative process and stops at iteration #2. I then aligned iterations #1 and #2 with the backbone and 2 other complete mtDNAs. Iteration #2 looks much worse reconstructed than #1, with a lot of long gaps. Is this normal, and should I take it anyway as the best de novo assembly?
- How important is the backbone, and how much does it bias the genome reconstruction?
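To make the splitting step concrete, here is a minimal Python sketch of the kind of subsampling I mean. It is not part of the MITObim protocol and only an illustration: the function names are hypothetical, the 16 kb genome length and 100 bp read length in the usage comment are placeholders (my real values may differ), and it assumes plain, uncompressed FASTQ files with 4 lines per record and mates in the same order in both files.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (assumes unwrapped 4-line records)."""
    while True:
        record = tuple(islice(handle, 4))
        if len(record) < 4:
            return
        yield record


def pairs_for_coverage(target_coverage, genome_length, read_length):
    """Read pairs needed so that 2 * pairs * read_length / genome_length
    is roughly target_coverage (counts only reads that hit the target)."""
    return max(1, round(target_coverage * genome_length / (2 * read_length)))


def fraction_to_keep(pairs_needed, pairs_available):
    """Fraction of the available mitochondrial pairs to keep, capped at 1."""
    return min(1.0, pairs_needed / pairs_available)


def subsample_pairs(r1_in, r2_in, r1_out, r2_out, fraction, seed=0):
    """Keep each read pair with probability `fraction`.

    One random draw decides both mates, so the two output files stay in sync.
    Returns the number of pairs written.
    """
    rng = random.Random(seed)
    kept = 0
    with open(r1_in) as f1, open(r2_in) as f2, \
            open(r1_out, "w") as o1, open(r2_out, "w") as o2:
        for rec1, rec2 in zip(read_fastq(f1), read_fastq(f2)):
            if rng.random() < fraction:
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
    return kept


# Hypothetical usage (all numbers are placeholders):
# needed = pairs_for_coverage(100, 16000, 100)
# frac = fraction_to_keep(needed, pairs_available=80000)
# subsample_pairs("R1.fastq", "R2.fastq", "sub_R1.fastq", "sub_R2.fastq", frac, seed=1)
```

Since only part of a transcriptome library is mitochondrial, `pairs_available` would have to be the number of pairs that mapped to the backbone in the bowtie step, not the full 140M. Changing `seed` gives a different random subset for each of the x reconstructions.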
Thank you so much for your help
Francesco