SEQanswers

Old 12-03-2010, 08:40 AM   #1
JueFish
Member
 
Location: Connecticut

Join Date: May 2010
Posts: 42
MIRA transcriptome assembly and isoforms

Perhaps just a quick question. I am in the process of figuring out what type of sequencing runs to do for a de novo transcriptome assembly effort for a couple of different species. Previously, we had planned to do everything with 454 data, but some friends have been trying to convince me that paired-end Illumina runs are a better way to go. To me, it seems a hybrid approach would be best, but my main concern is finding an appropriate tool to analyze that data. I know that MIRA has been cited by many on this site as a great hybrid assembler, but I wonder how it does with predicting isoforms and such. It seems one of the more complicated bits in this type of assembly may be combining different rules for isoform determination. Does anyone out there have experience using these data types to make a reference transcriptome, or any thoughts on what might be the best tool for this type of assembly? Also, has anyone used MIRA for this type of assembly and have some insight into how it does with identifying isoforms?

Cheers,
Nate
Old 12-08-2010, 10:37 PM   #2
Jose Blanca
Member
 
Location: Valencia, Spain

Join Date: Aug 2009
Posts: 70

MIRA does not handle isoforms well at all. It is designed for bacterial genomes, and it assumes that your sample is haploid and that there are no isoforms.
This has two consequences:
- It treats ESTs from different isoforms as chimeric.
- It treats SNPs as variation in duplicated genes.
Despite that problem I'm still using it for assembling diploid transcriptomes because there are no better tools out there for Sanger+454 projects. I've tried Newbler 2.3 and the result was even worse in that regard. I don't know if a newer Newbler would be any better, but since it's not trivial to get a copy of Newbler I've not tried it.
This is a serious problem for me and I'd like to find a good solution.

Last edited by Jose Blanca; 12-08-2010 at 10:39 PM.
Old 12-10-2010, 09:38 AM   #3
JueFish
Member
 
Location: Connecticut

Join Date: May 2010
Posts: 42

Thanks for your thoughts, Jose. I have obtained a copy of Newbler 2.5 and am currently using it to do some assembly. I'll let you know how I think it works out once I get a better handle on the results. Did you see the recent paper by Kumar and Blaxter in BMC Genomics that compares assemblers? I'd be curious what you thought of it given your experience with a couple of these other assemblers, and whether you'd had similar experiences or not. Also, just to clarify, you were saying that the two consequences that you found in MIRA were worse in Newbler 2.3. Was that correct?
Old 12-10-2010, 10:08 PM   #4
kirby
Junior Member
 
Location: LA

Join Date: Sep 2009
Posts: 3

I've found that you have to run several rounds of MIRA, otherwise it under-assembles transcriptome data. So I pass the contigs and 'debris' sequences from the first assembly into a second assembly and so on, each time keeping the assembly parameters the same. It seems to take 4-5 rounds of repeated assembly before MIRA stops finding new contigs. Then I pass the contigs to CAP3 for a final round of clustering. Note that this isn't a criticism of MIRA, and I suspect that this under-clustering is just a consequence of the uneven read coverage and sequence diversity of transcriptomes. I think MIRA is an incredibly flexible and useful tool!
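
The repeated-assembly loop described in this post can be sketched roughly as follows. This is only an illustration of the control flow, not a MIRA wrapper: `assemble` is a hypothetical callable that you would have to write around your own MIRA run, `toy_assemble` is a made-up stand-in that merges one overlapping pair per call so the loop has something to do, and the stopping rule (stop when a round no longer reduces the number of sequences) is an assumed reading of "stops finding new contigs". The final CAP3 step is not shown.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: an assembler takes sequences and returns (contigs, debris).
Assembler = Callable[[List[str]], Tuple[List[str], List[str]]]

READ_LEN = 6  # toy reads are 6 bases long; anything longer counts as a contig


def reassemble_until_stable(
    reads: List[str], assemble: Assembler, max_rounds: int = 10
) -> Tuple[List[str], List[str], int]:
    """Feed contigs + debris back into the assembler until nothing more merges."""
    contigs, debris = assemble(reads)
    rounds = 1
    while rounds < max_rounds:
        new_contigs, new_debris = assemble(contigs + debris)
        rounds += 1
        # Assumed stopping rule: the round merged nothing further.
        stable = len(new_contigs) + len(new_debris) == len(contigs) + len(debris)
        contigs, debris = new_contigs, new_debris
        if stable:
            break
    return contigs, debris, rounds


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a equal to a prefix of b, or 0 if < min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def first_overlapping_pair(pool: List[str]) -> Optional[Tuple[int, int, int]]:
    """Return (i, j, overlap_length) for the first pair that overlaps, else None."""
    for i, a in enumerate(pool):
        for j, b in enumerate(pool):
            if i != j:
                n = overlap(a, b)
                if n:
                    return i, j, n
    return None


def toy_assemble(seqs: List[str]) -> Tuple[List[str], List[str]]:
    """Toy stand-in for a real assembler: merges at most ONE pair per call."""
    pool = list(seqs)
    hit = first_overlapping_pair(pool)
    if hit is not None:
        i, j, n = hit
        merged = pool[i] + pool[j][n:]
        pool = [s for k, s in enumerate(pool) if k not in (i, j)] + [merged]
    contigs = [s for s in pool if len(s) > READ_LEN]
    debris = [s for s in pool if len(s) <= READ_LEN]
    return contigs, debris


if __name__ == "__main__":
    reads = ["ACGTAC", "TACGGA", "GGATTC", "TTTTTT"]
    print(reassemble_until_stable(reads, toy_assemble))
```

With a real assembler the interesting part is the bookkeeping between rounds (collecting the contig and debris files and handing them back in with unchanged parameters); the loop itself stays this simple.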
Old 12-12-2010, 04:21 AM   #5
Jose Blanca
Member
 
Location: Valencia, Spain

Join Date: Aug 2009
Posts: 70

Quote:
Originally Posted by JueFish
I'll let you know how I think it works out once I get a better handle on the results.
That would be very useful for me, thanks.

Quote:
Originally Posted by JueFish
Did you see the recent paper by Kumar and Blaxter in BMC Genomics that compares assemblers? I'd be curious what you thought of it given your experience with a couple of these other assemblers, and whether you'd had similar experiences or not.
Yes, I read the paper. Unfortunately their analysis is quite shallow. The metrics they used only take into account the number and length of contigs. They ignore the problems that we're discussing.

Quote:
Originally Posted by JueFish
Also, just to clarify, you were saying that the two consequences that you found in MIRA were worse in Newbler 2.3. Was that correct?
Yes, that was with 2.3. If I get a copy of a newer version I will try it out again.

The reassembly approach finishing with CAP3 could be a good one. Unfortunately, that way CAP3 has no information about the coverage. I have also had serious problems with CAP3: it tends to crash and it is no longer maintained.
Old 02-02-2011, 08:34 PM   #6
sarwar
Member
 
Location: delhi , india

Join Date: Apr 2010
Posts: 14
MIRA parameters for 454 data and hybrid (454+Sanger)

Quote:
Originally Posted by kirby
I've found that you have to run several rounds of MIRA, otherwise it under-assembles transcriptome data. So I pass the contigs and 'debris' sequences from the first assembly into a second assembly and so on, each time keeping the assembly parameters the same. It seems to take 4-5 rounds of repeated assembly before MIRA stops finding new contigs. Then I pass the contigs to CAP3 for a final round of clustering. Note that this isn't a criticism of MIRA, and I suspect that this under-clustering is just a consequence of the uneven read coverage and sequence diversity of transcriptomes. I think MIRA is an incredibly flexible and useful tool!
Dear kirby, could you share the MIRA parameters you use for 454 transcriptome assembly? When I assemble, I get a consensus quality of 48 and few strong unresolved repeat positions (SRMc), while the number of consensus bases with IUPAC codes is quite high. Should we check those consensus bases?

thanks
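
One way to start checking those consensus bases is simply to list where the IUPAC ambiguity codes fall in each contig. The sketch below is only an illustration under stated assumptions: it assumes the consensus sequences are available as plain FASTA text, it treats N as an ambiguity code along with R, Y, S, W, K, M, B, D, H and V (a choice you may want to change), and the function names are made up for this example rather than taken from MIRA.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# IUPAC nucleotide ambiguity codes; including N here is an assumption.
IUPAC_AMBIGUITY = set("RYSWKMBDHVN")


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def ambiguous_positions(seq: str) -> List[Tuple[int, str]]:
    """Return 1-based (position, code) for every ambiguity code in seq."""
    return [
        (i + 1, base)
        for i, base in enumerate(seq.upper())
        if base in IUPAC_AMBIGUITY
    ]


def ambiguity_report(lines: Iterable[str]) -> Dict[str, List[Tuple[int, str]]]:
    """Map each contig name to the ambiguity codes found in its consensus."""
    return {name: ambiguous_positions(seq) for name, seq in parse_fasta(lines)}


if __name__ == "__main__":
    example = [">c1 toy contig", "ACGTRA", "YN", ">c2", "acgt"]
    for contig, hits in ambiguity_report(example).items():
        print(contig, len(hits), hits)
```

For a real file you could pass an open file handle instead of the list. Contigs with many hits are the ones worth inspecting in the alignment to see whether the ambiguity reflects SNPs, isoforms or paralogs.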
Tags
454, hybrid assembly, illumina, isoforms, mira

All times are GMT -8.