  • MIRA transcriptome assembly and isoforms

    Just a quick question, perhaps. I am in the process of figuring out what type of sequencing runs to do for a de novo transcriptome assembly effort for a couple of different species. Previously, we had planned to do everything with 454 data, but some friends have been trying to convince me that paired-end Illumina runs are a better way to go. To me, a hybrid approach seems best, but my main concern is finding an appropriate tool to analyze that data. I know that MIRA has been cited by many on this site as a great hybrid assembler, but I wonder how it does with predicting isoforms and such. It seems one of the more complicated bits in this type of assembly may be combining different rules for isoform determination. Does anyone out there have experience using these data types to make a reference transcriptome, or any thoughts on what might be the best tool for this type of assembly? Also, has anyone used MIRA to do this type of assembly and have some insight into how it does with identifying isoforms?

    Cheers,
    Nate

  • #2
    MIRA does not handle isoforms well at all. It is designed for bacterial genomes, and it assumes that your sample is haploid and that there are no isoforms.
    This has two consequences:
    - It treats ESTs from different isoforms as chimeric.
    - It treats SNPs as variation in duplicated genes.
    Despite that problem, I'm still using it to assemble diploid transcriptomes because there are no better tools out there for Sanger+454 projects. I've tried Newbler 2.3 and the result was even worse in that regard. I don't know whether a newer Newbler would be any better, but since it's not trivial to get a copy of Newbler, I haven't tried it.
    This is a serious problem for me and I'd like to find a good solution.
    Last edited by Jose Blanca; 12-08-2010, 11:39 PM.

    • #3
      Thanks for your thoughts, Jose. I have obtained a copy of Newbler 2.5 and am currently using it to do some assembly. I'll let you know how I think it works out once I get a better handle on the results. Did you see the recent paper by Kumar and Blaxter in BMC Genomics that compares assemblers? I'd be curious what you thought of it given your experience with a couple of these other assemblers, and whether you'd had similar experiences. Also, just to clarify, you were saying that the two consequences that you found in MIRA were worse in Newbler 2.3. Was that correct?

      • #4
        I've found that you have to run several rounds of MIRA otherwise it under-assembles transcriptome data. So I pass the contigs and 'debris' sequences from the first assembly into a second assembly and so on, each time keeping the assembly parameters the same. It seems to take 4-5 rounds of repeated assembly before MIRA stops finding new contigs. Then I pass the contigs to CAP3 for a final round of clustering. Note that this isn't a criticism of MIRA and I suspect that this under-clustering is just a consequence of the uneven read coverage and sequence diversity of transcriptomes. I think MIRA is an incredibly flexible and useful tool!
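
        The loop described above could be sketched roughly as follows. This is only an illustration of the control flow, not the actual MIRA invocation: `iterate_assembly`, `toy_assemble` and `overlap_merge` are hypothetical names, and the toy assembler merely stands in for a real MIRA run (whose command line depends on your version and data). The stopping rule used here, "a round merged nothing", is an assumption about what "stops finding new contigs" means in practice.

```python
def overlap_merge(a, b, min_overlap):
    """Join a and b on the longest suffix(a)/prefix(b) overlap, or return None."""
    for k in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a.endswith(b[:k]):
            return a + b[k:]
    return None


def toy_assemble(seqs, min_overlap=3, read_len=6):
    """Toy stand-in for one assembler run (NOT MIRA).

    Each sequence takes part in at most one merge per round, which mimics
    an assembler that under-assembles on a single pass. Anything longer
    than read_len is reported as a contig, the rest as debris.
    """
    pool = list(seqs)
    out = []
    while pool:
        a = pool.pop(0)
        for i, b in enumerate(pool):
            merged = overlap_merge(a, b, min_overlap) or overlap_merge(b, a, min_overlap)
            if merged:
                out.append(merged)
                del pool[i]
                break
        else:
            out.append(a)  # nothing to merge with this round
    contigs = [s for s in out if len(s) > read_len]
    debris = [s for s in out if len(s) <= read_len]
    return contigs, debris


def iterate_assembly(reads, assemble, max_rounds=10):
    """Feed contigs + debris back into `assemble` until a round merges nothing."""
    sequences = list(reads)
    contigs, debris, rounds = [], [], 0
    while rounds < max_rounds:
        contigs, debris = assemble(sequences)
        rounds += 1
        merged_something = len(contigs) + len(debris) < len(sequences)
        sequences = contigs + debris
        if not merged_something:
            break
    return contigs, debris, rounds


reads = ["ACGTTG", "TTGCAA", "CAAGCT", "GCTTAG", "GGGGGG"]
contigs, debris, rounds = iterate_assembly(reads, toy_assemble)
print(rounds, contigs, debris)
```

        In a real pipeline the `assemble` argument would wrap a call to the assembler with unchanged parameters, and the final contigs would then go to a separate clustering step such as CAP3.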

        • #5
          Originally posted by JueFish View Post
          I'll let you know how I think it works out once I get a better handle on the results.
          That would be very useful for me, thanks.

          Originally posted by JueFish View Post
          Did you see the recent paper by Kumar and Blaxter in BMC Genomics that compares assemblers? I'd be curious what you thought of it given your experience with a couple of these other assemblers, and whether you'd had similar experiences.
          Yes, I read the paper. Unfortunately, their analysis is quite shallow. The metrics they used only take into account the number and length of contigs. They ignore the problems that we're discussing here.
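
          To make the point concrete, here is a minimal sketch of the kind of summary that depends only on contig count and length. N50 is used as a typical length-based metric; that choice is an assumption, since the metrics of the paper are not listed here, and `contig_stats` is a hypothetical helper name.

```python
def contig_stats(lengths):
    """Return (number of contigs, total bases, N50) for a list of contig lengths.

    N50 is the length of the contig at which the running total, taken from
    the longest contig downwards, first reaches half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return len(ordered), total, n50


print(contig_stats([100, 200, 300, 400]))
```

          Two assemblies with the same contig lengths get the same numbers from such a summary, whether or not isoforms were split apart or SNP alleles were separated into different contigs.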

          Originally posted by JueFish View Post
          Also, just to clarify, you were saying that the two consequences that you found in MIRA were worse in Newbler 2.3. Was that correct?
          Yes, that was with 2.3. If I get a copy of a newer version, I will try it out again.

          The reassembly approach followed by CAP3 could be a good one. Unfortunately, when used that way CAP3 has no information about the coverage. I have also had serious problems with CAP3: it tends to crash and it is not maintained anymore.

          • #6
            MIRA parameters for 454 data and hybrid (454+sanger)

            Originally posted by kirby View Post
            I've found that you have to run several rounds of MIRA otherwise it under-assembles transcriptome data. So I pass the contigs and 'debris' sequences from the first assembly into a second assembly and so on, each time keeping the assembly parameters the same. It seems to take 4-5 rounds of repeated assembly before MIRA stops finding new contigs. Then I pass the contigs to CAP3 for a final round of clustering. Note that this isn't a criticism of MIRA and I suspect that this under-clustering is just a consequence of the uneven read coverage and sequence diversity of transcriptomes. I think MIRA is an incredibly flexible and useful tool!
            Could you share the MIRA parameters you use for 454 transcriptome assembly? When I assemble, I get a consensus quality of 48 and few strong unresolved repeat positions (SRMc), while the number of consensus bases with IUPAC codes is quite high. Should we check those consensus cases?

            thanks
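
            One way to start checking those positions would be to list where the IUPAC ambiguity codes fall in each contig. The sketch below is only a rough illustration: it assumes the consensus is available as plain FASTA text, it does not read any MIRA-specific tags such as SRMc, and `iupac_positions` is a hypothetical helper name. `N` is deliberately left out of the code set, on the assumption that it marks an unknown base rather than a choice between specific bases.

```python
# IUPAC codes that stand for two or three possible bases (N excluded by assumption).
AMBIGUITY_CODES = set("RYSWKMBDHV")


def iupac_positions(fasta_text):
    """Map contig name -> list of (1-based position, code) for ambiguous bases."""
    result = {}
    name = None
    offset = 0  # bases of the current contig seen on previous lines
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            result[name] = []
            offset = 0
        elif name is not None:
            for i, base in enumerate(line.upper(), start=1):
                if base in AMBIGUITY_CODES:
                    result[name].append((offset + i, base))
            offset += len(line)
    return result


example = ">c1 len=8\nACGR\nTTYA\n>c2\nACGT\n"
for contig, hits in iupac_positions(example).items():
    print(contig, len(hits), hits)
```

            Contigs with many hits would be the first candidates to inspect in the alignment, since in a diploid transcriptome these positions may reflect SNPs rather than assembly errors.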
