
**Jose Blanca** · 12-08-2010, 11:37 PM

MIRA does not handle isoforms well at all. It is designed for bacterial genomes and it assumes that your sample is haploid and that there are no isoforms.
This has two consequences:
- It treats ESTs from different isoforms as chimeric.
- It treats SNPs as variation in duplicated genes.
Despite that problem, I'm still using it for assembling diploid transcriptomes because there are no better tools out there for Sanger+454 projects. I've tried Newbler 2.3 and the result was even worse in that regard. I don't know if the newer Newbler could be any better, but since it's not trivial to get a copy of Newbler I've not tried it.
This is a serious problem for me and I'd like to find a good solution.

**JueFish** · 12-10-2010, 10:38 AM

Thanks for your thoughts, Jose. I have obtained a copy of Newbler 2.5 and am currently using it to do some assembly. I'll let you know how I think it works out once I get a better handle on the results. Did you see the recent paper by Kumar and Blaxter in BMC Genomics that compares assemblers? I'd be curious what you thought of it given your experience with a couple of these other assemblers, and whether you've had similar experiences or not. Also, just to clarify, you were saying that the two consequences that you found in MIRA were worse in Newbler 2.3. Was that correct?

**kirby** · 12-10-2010, 11:08 PM

I've found that you have to run several rounds of MIRA otherwise it under-assembles transcriptome data. So I pass the contigs and 'debris' sequences from the first assembly into a second assembly and so on, each time keeping the assembly parameters the same. It seems to take 4-5 rounds of repeated assembly before MIRA stops finding new contigs. Then I pass the contigs to CAP3 for a final round of clustering. Note that this isn't a criticism of MIRA and I suspect that this under-clustering is just a consequence of the uneven read coverage and sequence diversity of transcriptomes. I think MIRA is an incredibly flexible and useful tool!
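The repeated-assembly loop described above can be sketched roughly as follows. This is only a minimal illustration, not kirby's actual pipeline: the `assemble` callable is a hypothetical wrapper you would write around your own assembler invocation, and the stopping rule (contig and debris counts unchanged between rounds) is an assumed stand-in for "stops finding new contigs". The `toy_assembler` is a made-up stand-in so the sketch runs on its own; it has nothing to do with how MIRA merges reads.

```python
from typing import Callable, List, Tuple

Sequences = List[str]
# Hypothetical interface: takes input sequences, returns (contigs, debris).
AssembleFn = Callable[[Sequences], Tuple[Sequences, Sequences]]


def iterative_assembly(
    reads: Sequences, assemble: AssembleFn, max_rounds: int = 10
) -> Tuple[Sequences, Sequences, int]:
    """Feed contigs + debris back into the assembler until nothing changes."""
    contigs, debris = assemble(reads)
    rounds = 1
    while rounds < max_rounds:
        new_contigs, new_debris = assemble(contigs + debris)
        rounds += 1
        # Assumed convergence test: the counts did not change this round.
        converged = (
            len(new_contigs) == len(contigs) and len(new_debris) == len(debris)
        )
        contigs, debris = new_contigs, new_debris
        if converged:
            break
    return contigs, debris, rounds


def toy_assembler(seqs: Sequences) -> Tuple[Sequences, Sequences]:
    """Toy stand-in (NOT MIRA): merges at most one pair per call,
    joining two sequences that start with the same letter."""
    seqs = list(seqs)
    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if seqs[i][0] == seqs[j][0]:
                merged = seqs[i] + seqs[j][1:]
                rest = [s for k, s in enumerate(seqs) if k not in (i, j)]
                return [merged] + rest, []
    return seqs, []


if __name__ == "__main__":
    contigs, debris, rounds = iterative_assembly(
        ["AC", "AG", "AT", "TT"], toy_assembler
    )
    print(rounds, contigs, debris)
```

In a real run, `assemble` would write the sequences to disk, call the assembler with the same parameters each round, and read back the contig and debris files; the final contigs would then go to a separate clustering step.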

**Jose Blanca** · 12-12-2010, 05:21 AM

Originally posted by JueFish View Post

I'll let you know how I think it works out once I get a better handle on the results.

That would be very useful for me, thanks.

Originally posted by JueFish View Post

Did you see the recent paper by Kumar and Blaxter in BMC Genomics that compares assemblers? I'd be curious what you thought of it given your experience with a couple of these other assemblers, and whether you've had similar experiences or not.

Yes, I read the paper. Unfortunately their analysis is quite shallow. The metrics they used only take into account the number and length of contigs. They ignore the problems that we're discussing.
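To illustrate why metrics based only on contig number and length miss this, here is a small sketch. It assumes contig count, total length and N50 as typical length-based summaries (which exact metrics the paper used is not restated here), and `contig_stats` is a hypothetical helper name. Two assemblies with the same contig lengths get identical scores even if one of them has wrongly split isoforms or collapsed paralogs.

```python
from typing import Dict, List


def contig_stats(lengths: List[int]) -> Dict[str, int]:
    """Length-only summary: contig count, total bases, and N50.

    N50 is the length of the contig at which the running sum of
    lengths (longest first) reaches half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {"count": len(ordered), "total": total, "n50": n50}


# Two hypothetical assemblies with the same contig lengths.
# Nothing in these numbers says whether the contigs are biologically correct.
assembly_a = [400, 300, 200, 100]
assembly_b = [100, 200, 300, 400]
```

Since both lists summarise to the same count, total and N50, a comparison built on these numbers alone cannot tell a correct assembly from one that mishandles isoforms or SNPs.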

Originally posted by JueFish View Post

Also, just to clarify, you were saying that the two consequences that you found in MIRA were worse in Newbler 2.3. Was that correct?

Yes, that was with 2.3. If I get a copy of a newer version I will try it out again.

The reassembly approach with a final CAP3 step could be a good one. Unfortunately, used that way CAP3 does not have information about the coverage. I have also had serious problems with CAP3: it tends to crash and it is no longer maintained.

**sarwar** · 02-02-2011, 09:34 PM

MIRA parameters for 454 data and hybrid (454+Sanger)

Originally posted by kirby View Post

I've found that you have to run several rounds of MIRA otherwise it under-assembles transcriptome data. So I pass the contigs and 'debris' sequences from the first assembly into a second assembly and so on, each time keeping the assembly parameters the same. It seems to take 4-5 rounds of repeated assembly before MIRA stops finding new contigs. Then I pass the contigs to CAP3 for a final round of clustering. Note that this isn't a criticism of MIRA and I suspect that this under-clustering is just a consequence of the uneven read coverage and sequence diversity of transcriptomes. I think MIRA is an incredibly flexible and useful tool!

Could you please share the MIRA parameters you use for 454 transcriptome assembly? When I assemble, I get a consensus quality of 48 and few strong unresolved repeat positions (SRMc), while the number of consensus bases with IUPAC codes is quite high. Should we check those consensus cases?

Thanks


MIRA transcriptome assembly and isoforms
