SEQanswers

08-18-2015, 06:29 AM   #1
Ramprasad
Junior Member
 
Location: Bangalore

Join Date: Jun 2011
Posts: 7
Problems in assembling mitochondrial genome

Hi all,

I have assembled a genome (from Illumina data) for a non-model species using ALLPATHS, and I'm searching for the mitochondrial genome. I have a list of 13 proteins that should be encoded on it, but I have been unable to locate them in any sensible manner in this assembly (I searched with the proteins using exonerate, spaln, BLAST and BLAT).

For example, BLASTing these proteins against the assembled genome finds only one protein, and I'm getting similar results with the other approaches. I want to know why they are not showing up. I expected the mitochondrial genome to be assembled in a single contig (about 20 kb in length), but it's puzzling that not even fragments are showing up.

Has anyone ever run into this problem with their assembly? Or does anyone have any idea what is going on here?

Any help would be appreciated.

Thanks and regards,
Ram
08-18-2015, 08:49 AM   #2
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Did you do any pre-filtering? The mitochondrial reads might have been lost (e.g. due to much higher coverage, or very different %GC).
08-18-2015, 11:40 PM   #3
colindaven
Senior Member
 
Location: Germany

Join Date: Oct 2008
Posts: 415

Hi,

recently had a project like this for some plants.

Observations
1) mitochondria are NOT easy to assemble. Expect 100+ contigs, perhaps ~400, with Illumina data. I had PacBio data, and even that did not assemble into one contig; it was more like 30-60.

2) mitochondria are highly variable in size, but as far as I know none are 20kb. See for example
http://www.ncbi.nlm.nih.gov/genomes/...&opt=organelle

3) as maubp suggested, perhaps take these 13 genes as a nucleotide FASTA and map the raw reads against them prior to assembly. Are they covered?
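One way to answer the "are they covered?" question, as a minimal sketch: assuming the reads were mapped to the gene FASTA with any aligner and the per-base depth was exported as tab-separated name, position, depth lines (the layout that `samtools depth -a` writes), the Python below reports breadth of coverage and mean depth per gene. The function name, the `min_depth` threshold and the gene names in the example are placeholders, not something from this thread.

```python
from collections import defaultdict


def coverage_summary(depth_lines, min_depth=1):
    """Summarise coverage per reference from 'name<TAB>pos<TAB>depth' lines.

    Assumes every position is listed, including zero-depth ones;
    otherwise breadth is computed over reported positions only.
    """
    total = defaultdict(int)      # positions reported per reference
    covered = defaultdict(int)    # positions with depth >= min_depth
    depth_sum = defaultdict(int)  # summed depth per reference
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        name, _pos, depth = line.split("\t")
        depth = int(depth)
        total[name] += 1
        depth_sum[name] += depth
        if depth >= min_depth:
            covered[name] += 1
    return {
        name: {
            "breadth": covered[name] / total[name],
            "mean_depth": depth_sum[name] / total[name],
        }
        for name in total
    }


# Hypothetical depth lines for two genes (made-up numbers).
example = [
    "cox1\t1\t120", "cox1\t2\t130", "cox1\t3\t0",
    "nad5\t1\t0", "nad5\t2\t0",
]
for gene, stats in coverage_summary(example).items():
    print(gene, round(stats["breadth"], 2), round(stats["mean_depth"], 1))
```

Genes with a breadth near zero were probably never sequenced or were lost in filtering; genes with a very high mean depth point to the coverage problem discussed further down the thread.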

Maybe the data quality isn't good enough.

cheers,
Colin
08-19-2015, 01:33 AM   #4
maubp
Peter (Biopython etc)
 
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

Re (3), I didn't explicitly suggest that, but it's a good idea and worth trying. You could also try mapping against mitochondrial sequences from the closest published relatives.
08-19-2015, 09:24 AM   #5
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Quote:
Originally Posted by maubp
You could also try mapping against mitochondrial sequences from the closest published relatives.

I second this; you might be able to grab most of the mito reads that way. Alternatively, I suggest normalizing the data prior to assembly to bring the mito coverage down to a level similar to the rest of the genome. That makes it much easier to assemble, and typically yields a superior assembly for anything with extremely high coverage.
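To illustrate what normalizing does, here is a toy Python sketch of the general digital-normalization idea: a read is kept only while the median count of its k-mers among the reads already kept is below a target. This is an illustration only, not how any particular normalization tool is implemented; `k`, `target` and the reads are made-up values, and reverse complements and sequencing errors are ignored.

```python
from collections import Counter
from statistics import median


def normalize_reads(reads, k=5, target=3):
    """Keep a read only while the median count of its k-mers is below target."""
    counts = Counter()  # k-mer counts over the reads kept so far
    kept = []
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        if not kmers:
            continue  # read is shorter than k
        if median(counts[km] for km in kmers) < target:
            kept.append(read)
            counts.update(kmers)
    return kept


# Made-up reads: one sequence seen 50 times (standing in for very deep
# mitochondrial coverage) and one seen twice (standing in for nuclear coverage).
deep = ["ACGTACGGTCA"] * 50
shallow = ["TTGACCATGGA"] * 2
kept = normalize_reads(deep + shallow)
print(len(kept), "of", len(deep) + len(shallow), "reads kept")
```

The deep region is capped near the target depth while the shallow region passes through untouched, which is the effect you want before handing the reads to an assembler.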
08-19-2015, 11:03 PM   #6
Linnea
Member
 
Location: Uppsala, Sweden

Join Date: Mar 2010
Posts: 23

I also agree, first try to get the mitochondrial reads.

We had exactly the same problem when assembling our non-model organism (1 Gb genome, mitochondrion ~16 kb), and didn't get any mitochondrial contigs at all. It turned out that the tissue we used was so full of mitochondria that the read coverage was just too high for the assembler to handle.

Instead, we mapped all reads to the closest mitochondrion we could find, and extracted a consensus from our mapped reads. Then we mapped our reads again against the consensus, corrected it, mapped again and continued in an iterative manner until the reads and the consensus matched perfectly. I think the software MITObim (https://github.com/chrishah/MITObim) works in a similar way (it wasn't released when we did this so I haven't tried it myself).
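The map / consensus / re-map loop can be sketched in a few lines of Python. This is a deliberately simplified toy, not the pipeline described above and not what MITObim does: it assumes substitutions only (no indels), places each read by brute-force ungapped comparison, and calls the consensus by majority vote. All function names, the `max_mismatches` cutoff and the sequences are made up for illustration.

```python
from collections import Counter


def best_offset(read, ref, max_mismatches):
    """Return the ungapped offset with the fewest mismatches, or None."""
    best, best_mm = None, max_mismatches + 1
    for off in range(len(ref) - len(read) + 1):
        mm = sum(a != b for a, b in zip(read, ref[off:off + len(read)]))
        if mm < best_mm:
            best, best_mm = off, mm
    return best


def consensus_round(reads, ref, max_mismatches=2):
    """Map reads to ref and return the majority-vote consensus."""
    columns = [Counter() for _ in ref]
    for read in reads:
        off = best_offset(read, ref, max_mismatches)
        if off is None:
            continue  # read does not map well enough in this round
        for i, base in enumerate(read):
            columns[off + i][base] += 1
    # Uncovered positions keep the base of the current reference.
    return "".join(
        col.most_common(1)[0][0] if col else ref_base
        for col, ref_base in zip(columns, ref)
    )


def iterate_consensus(reads, ref, max_rounds=10, max_mismatches=2):
    """Repeat mapping and consensus calling until the sequence stops changing."""
    for _ in range(max_rounds):
        new_ref = consensus_round(reads, ref, max_mismatches)
        if new_ref == ref:
            break
        ref = new_ref
    return ref


# Made-up data: reads from our "true" sequence, and a relative's
# sequence that differs from it by two substitutions.
target = "ACGTTGCAAGGCTTAC"
relative = "ACATTGCAAGGCTGAC"
reads = [target[0:8], target[4:12], target[8:16]]
print(iterate_consensus(reads, relative) == target)
```

With real data the reference is usually more divergent, so reads that fail to map in the first round can start mapping once their neighbours have corrected the consensus; that is why the loop runs until nothing changes.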

Good luck!
08-20-2015, 11:27 PM   #7
colindaven
Senior Member
 
Location: Germany

Join Date: Oct 2008
Posts: 415

Interesting. Good suggestions everybody.

Just stumbled across this program as well, which looks decent:

http://pythonhosted.org/ORG.asm/index.html

Tags
genome assembly, mitochondrial genome

All times are GMT -8.