  • MIRA transcriptome assembly and isoforms

    Perhaps just a quick question. I am figuring out what type of sequencing runs to do for a de novo transcriptome assembly effort for a couple of different species. Previously, we had planned to do everything with 454 data, but some friends have been trying to convince me that paired-end Illumina runs are a better way to go. To me, a hybrid approach seems best, but my main concern is finding an appropriate tool to analyze that data. I know that MIRA has been cited by many on this site as a great hybrid assembler, but I wonder how it does with predicting isoforms. One of the more complicated parts of this type of assembly may be combining different rules for isoform determination. Does anyone have experience using these data types to make a reference transcriptome, or any thoughts on the best tool for this type of assembly? Also, has anyone used MIRA for this type of assembly, and do you have any insight into how it does with identifying isoforms?

    Cheers,
    Nate

  • #2
    MIRA does not handle isoforms well at all. It is designed for bacterial genomes, and it assumes that your sample is haploid and that there are no isoforms.
    This has two consequences:
    - It treats ESTs from different isoforms as chimeric.
    - It treats SNPs as variation in duplicated genes.
    Despite that problem, I'm still using it to assemble diploid transcriptomes because there are no better tools out there for Sanger+454 projects. I tried Newbler 2.3 and the result was even worse in that regard. I don't know whether a newer Newbler would be any better, but since it's not trivial to get a copy of Newbler, I haven't tried it.
    This is a serious problem for me and I'd like to find a good solution.
    Last edited by Jose Blanca; 12-08-2010, 11:39 PM.


    • #3
      Thanks for your thoughts, Jose. I have obtained a copy of Newbler 2.5 and am currently using it to do some assembly. I'll let you know how I think it works out once I get a better handle on the results. Did you see the recent paper by Kumar and Blaxter in BMC Genomics that compares assemblers? I'd be curious what you thought of it, and whether it matches your own experience with a couple of these assemblers. Also, just to clarify, you were saying that the two consequences you found in MIRA were worse in Newbler 2.3. Is that correct?


      • #4
        I've found that you have to run several rounds of MIRA otherwise it under-assembles transcriptome data. So I pass the contigs and 'debris' sequences from the first assembly into a second assembly and so on, each time keeping the assembly parameters the same. It seems to take 4-5 rounds of repeated assembly before MIRA stops finding new contigs. Then I pass the contigs to CAP3 for a final round of clustering. Note that this isn't a criticism of MIRA and I suspect that this under-clustering is just a consequence of the uneven read coverage and sequence diversity of transcriptomes. I think MIRA is an incredibly flexible and useful tool!
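The repeat-until-stable loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `iterative_assembly` and `toy_assemble_once` are hypothetical names, and the toy overlap merger merely stands in for one assembler run. It is not MIRA, does not call MIRA, and does not model MIRA's parameters. In practice `assemble_once` would wrap whatever command produces contigs and debris from the pooled input.

```python
from typing import Callable, List, Tuple

Sequences = List[str]
# One assembly round: takes input sequences, returns (contigs, debris).
AssembleFn = Callable[[Sequences], Tuple[Sequences, Sequences]]


def iterative_assembly(
    reads: Sequences, assemble_once: AssembleFn, max_rounds: int = 10
) -> Tuple[Sequences, List[int]]:
    """Feed contigs + debris back into the assembler until nothing more merges.

    Returns the final sequence set and the sequence count after each round.
    """
    current = list(reads)
    history: List[int] = []
    for _ in range(max_rounds):
        contigs, debris = assemble_once(current)
        next_input = contigs + debris
        history.append(len(next_input))
        if len(next_input) >= len(current):
            break  # no further merging this round, so stop
        current = next_input
    return current, history


def toy_assemble_once(seqs: Sequences, min_overlap: int = 3) -> Tuple[Sequences, Sequences]:
    """Hypothetical stand-in for an assembler run (NOT MIRA).

    Merges only the first pair found with an exact suffix/prefix overlap,
    so that several rounds are needed, mimicking under-assembly.
    """
    for i, a in enumerate(seqs):
        for j, b in enumerate(seqs):
            if i == j:
                continue
            for k in range(min(len(a), len(b)), min_overlap - 1, -1):
                if a.endswith(b[:k]):
                    merged = a + b[k:]
                    rest = [s for idx, s in enumerate(seqs) if idx not in (i, j)]
                    return [merged], rest
    return [], list(seqs)


if __name__ == "__main__":
    reads = ["ACGTAC", "TACGGA", "GGATTT", "CCCCCC"]
    final, history = iterative_assembly(reads, toy_assemble_once)
    print(final, history)
```

The stopping rule here (sequence count no longer decreasing) is one possible reading of "stops finding new contigs"; a real pipeline might compare contig sets or total assembled length instead.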


        • #5
          Originally posted by JueFish
          I'll let you know how I think it works out once I get a better handle on the results.
          That would be very useful for me, thanks.

          Originally posted by JueFish
          Did you see the recent paper by Kumar and Blaxter in BMC Genomics that compares assemblers? I'd be curious what you thought of it given your experience with a couple of these other assemblers. If you'd had similar experiences or not.
          Yes, I read the paper. Unfortunately, their analysis is quite shallow. The metrics they used only take into account the number and length of contigs. They ignore the problems that we're discussing.
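To make the point about length-based metrics concrete, here is a minimal sketch of the kind of summary that depends only on how many contigs there are and how long they are. The function names `n50` and `summarize` are hypothetical, and whether the paper reported exactly these statistics is an assumption. Note that nothing in the calculation looks at what is inside the contigs, so chimeric joins between isoforms or split alleles would not change the numbers.

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def summarize(contigs: List[str]) -> Dict[str, float]:
    """Count- and length-only summary; contig content is never inspected."""
    lengths = [len(c) for c in contigs]
    total = sum(lengths)
    return {
        "count": len(lengths),
        "total_bases": total,
        "mean_length": total / len(lengths) if lengths else 0.0,
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    print(summarize(["ACGT" * 3, "AC" * 2, "A" * 8]))
```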

          Originally posted by JueFish
          Also, just to clarify, you were saying that the two consequences that you found in MIRA were worse in Newbler 2.3. Was that correct?
          Yes, that was with 2.3. If I get a copy of a newer version, I will try it out again.

          The reassembly approach with a final CAP3 step could be a good one. Unfortunately, that way CAP3 has no information about the coverage. I have also had serious problems with CAP3: it tends to crash, and it is not maintained anymore.


          • #6
            MIRA parameters for 454 data and hybrid (454+sanger)

            Originally posted by kirby
            I've found that you have to run several rounds of MIRA otherwise it under-assembles transcriptome data. So I pass the contigs and 'debris' sequences from the first assembly into a second assembly and so on, each time keeping the assembly parameters the same. It seems to take 4-5 rounds of repeated assembly before MIRA stops finding new contigs. Then I pass the contigs to CAP3 for a final round of clustering. Note that this isn't a criticism of MIRA and I suspect that this under-clustering is just a consequence of the uneven read coverage and sequence diversity of transcriptomes. I think MIRA is an incredibly flexible and useful tool!
            Could you please share the MIRA parameters you use for 454 transcriptome assembly? When I assemble, I get a consensus quality of 48 and only a few strong unresolved repeat positions (SRMc), while the number of consensus bases with IUPAC codes is quite high. Should we check those consensus cases?

            thanks
