  • JueFish
    Member
    • May 2010
    • 42

    MIRA transcriptome assembly and isoforms

    Just a quick question. I am in the process of figuring out what type of sequencing runs to do for a de novo transcriptome assembly effort for a couple of different species. Previously, we had planned to do everything with 454 data, but some friends have been trying to convince me that paired-end Illumina runs are a better way to go. To me, a hybrid approach seems best, but my main concern is finding an appropriate tool to analyze that data. I know that MIRA has been cited by many on this site as a great hybrid assembler, but I wonder how it does with predicting isoforms and such. It seems one of the more complicated bits in this type of assembly may be combining different rules for isoform determination. Does anyone out there have experience using these data types to make a reference transcriptome, or any thoughts on what might be the best tool for this type of assembly? Also, has anyone used MIRA for this type of assembly, and do you have some insight into how it does with identifying isoforms?

    Cheers,
    Nate
  • Jose Blanca
    Member
    • Aug 2009
    • 70

    #2
    MIRA does not handle isoforms well at all. It is designed for bacterial genomes, and it assumes that your sample is haploid and that there are no isoforms.
    This has two consequences:
    - It treats ESTs from different isoforms as chimeric.
    - It treats SNPs as variation in duplicated genes.
    Despite that problem, I'm still using it to assemble diploid transcriptomes because there are no better tools out there for Sanger+454 projects. I tried Newbler 2.3 and the result was even worse in that regard. I don't know whether a newer Newbler would be any better, but since it's not trivial to get a copy of Newbler, I haven't tried it.
    This is a serious problem for me and I'd like to find a good solution.
    Last edited by Jose Blanca; 12-08-2010, 11:39 PM.


    • JueFish
      Member
      • May 2010
      • 42

      #3
      Thanks for your thoughts, Jose. I have obtained a copy of Newbler 2.5 and am currently using it to do some assembly. I'll let you know how I think it works out once I get a better handle on the results. Did you see the recent paper by Kumar and Blaxter in BMC Genomics that compares assemblers? I'd be curious what you thought of it given your experience with a couple of these other assemblers, and whether you've had similar experiences. Also, just to clarify, you were saying that the two consequences that you found in MIRA were worse in Newbler 2.3. Was that correct?


      • kirby
        Junior Member
        • Sep 2009
        • 3

        #4
        I've found that you have to run several rounds of MIRA otherwise it under-assembles transcriptome data. So I pass the contigs and 'debris' sequences from the first assembly into a second assembly and so on, each time keeping the assembly parameters the same. It seems to take 4-5 rounds of repeated assembly before MIRA stops finding new contigs. Then I pass the contigs to CAP3 for a final round of clustering. Note that this isn't a criticism of MIRA and I suspect that this under-clustering is just a consequence of the uneven read coverage and sequence diversity of transcriptomes. I think MIRA is an incredibly flexible and useful tool!
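
The repeated-assembly loop described above can be sketched in Python. This is only an illustration under stated assumptions: `assemble` is a hypothetical callable standing in for whatever wrapper you write around MIRA (no real MIRA options are shown), `toy_assemble` is a toy replacement that joins exact suffix/prefix overlaps so the loop can be run, and the stopping rule (the number of sequences no longer shrinks) is one possible reading of "stops finding new contigs". The final CAP3 clustering step is not shown.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: takes sequences, returns (contigs, debris).
Assembler = Callable[[List[str]], Tuple[List[str], List[str]]]


def iterate_assembly(reads: List[str], assemble: Assembler,
                     max_rounds: int = 10) -> Tuple[List[str], int]:
    """Feed contigs + debris back into the assembler until nothing more joins."""
    sequences = list(reads)
    rounds = 0
    for _ in range(max_rounds):
        contigs, debris = assemble(sequences)
        rounds += 1
        merged = contigs + debris
        converged = len(merged) >= len(sequences)  # no sequences were joined
        sequences = merged
        if converged:
            break
    return sequences, rounds


def toy_assemble(seqs: List[str], min_overlap: int = 3) -> Tuple[List[str], List[str]]:
    """Toy stand-in (NOT MIRA): joins at most one exactly overlapping pair per call."""
    for i, a in enumerate(seqs):
        for j, b in enumerate(seqs):
            if i == j:
                continue
            # Try the longest suffix/prefix overlap first.
            for k in range(min(len(a), len(b)), min_overlap - 1, -1):
                if a.endswith(b[:k]):
                    rest = [s for n, s in enumerate(seqs) if n not in (i, j)]
                    return [a + b[k:]], rest
    return [], list(seqs)


if __name__ == "__main__":
    final, rounds = iterate_assembly(["ACGTAC", "TACGGA", "GGATTT"], toy_assemble)
    print(rounds, final)
```

The same parameters are reused in every round, as in the post; `max_rounds` is just a safety limit for the sketch.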


        • Jose Blanca
          Member
          • Aug 2009
          • 70

          #5
          Originally posted by JueFish
          I'll let you know how I think it works out once I get a better handle on the results.
          That would be very useful for me, thanks.

          Originally posted by JueFish
          Did you see the recent paper by Kumar and Blaxter in BMC Genomics that compares assemblers? I'd be curious what you thought of it given your experience with a couple of these other assemblers, and whether you've had similar experiences.
          Yes, I read the paper. Unfortunately, their analysis is quite shallow. The metrics they used only take into account the number and length of contigs. They ignore the problems that we're discussing.

          Originally posted by JueFish
          Also, just to clarify, you were saying that the two consequences that you found in MIRA were worse in Newbler 2.3. Was that correct?
          Yes, that was with 2.3. If I get a copy of a newer version, I will try it again.

          The reassembly approach with a final CAP3 step could be a good one. Unfortunately, done that way, CAP3 does not have information about the coverage. I have also had serious problems with CAP3: it tends to crash and it is not maintained anymore.
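
To illustrate the point about metrics based only on the number and length of contigs, here is a small Python sketch of such a summary. N50 is used as a common example of a length-based metric; it is an assumption for illustration and not taken from the paper. The function name `contig_stats` is hypothetical.

```python
from typing import List, Tuple


def contig_stats(lengths: List[int]) -> Tuple[int, int, int]:
    """Return (number of contigs, total bases, N50) from contig lengths alone."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        # N50: length of the contig at which half of all bases are covered.
        if running * 2 >= total:
            n50 = length
            break
    return len(ordered), total, n50


if __name__ == "__main__":
    print(contig_stats([100, 200, 300, 400]))
```

Because the input is only a list of lengths, two assemblies with identical contig lengths get identical scores, even if one of them has split isoforms apart or collapsed SNP-carrying alleles incorrectly.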


          • sarwar
            Member
            • Apr 2010
            • 14

            #6
            MIRA parameters for 454 data and hybrid (454+sanger)

            Originally posted by kirby
            I've found that you have to run several rounds of MIRA otherwise it under-assembles transcriptome data. So I pass the contigs and 'debris' sequences from the first assembly into a second assembly and so on, each time keeping the assembly parameters the same. It seems to take 4-5 rounds of repeated assembly before MIRA stops finding new contigs. Then I pass the contigs to CAP3 for a final round of clustering. Note that this isn't a criticism of MIRA and I suspect that this under-clustering is just a consequence of the uneven read coverage and sequence diversity of transcriptomes. I think MIRA is an incredibly flexible and useful tool!
            Could you share the MIRA parameters you use for 454 transcriptome assembly? When I assemble, I get a consensus quality of 48 and only a few Strong unresolved repeat positions (SRMc), while the number of Consensus bases with IUPAC is quite high. Should we check those consensus cases?

            thanks
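
One way to inspect the IUPAC positions asked about above is to list them for each consensus sequence. The Python sketch below is an illustration only: the function name `iupac_positions` is hypothetical, it works on a plain sequence string rather than on any MIRA output file, and counting `N` as an ambiguity code is an assumption that may not match how MIRA computes its own statistic.

```python
from typing import List, Tuple

# IUPAC nucleotide ambiguity codes (everything other than A, C, G, T/U).
IUPAC_AMBIGUITY = set("RYSWKMBDHVN")


def iupac_positions(consensus: str) -> List[Tuple[int, str]]:
    """Return (0-based position, code) for every ambiguity code in a consensus."""
    return [
        (index, base)
        for index, base in enumerate(consensus.upper())
        if base in IUPAC_AMBIGUITY
    ]


if __name__ == "__main__":
    for position, code in iupac_positions("ACGTRYACGN"):
        print(position, code)
```

The reported positions can then be compared against the read alignments to judge whether each one looks like a SNP, a paralog difference, or a sequencing error.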
