  • Assembling De Novo 454 Transcriptome Contigs and Singletons with Illumina Short Reads

    Background: I have assembled 4 million non-normalized 454 Titanium de novo transcriptome reads using Newbler 2.5 and have gotten amazing results: 88,000 contigs with an N50 of 951 bp, 35,000 isotigs with an N50 of 1,500 bp (the longest being 7,900 bp), and 276,564 singletons. Now I am about to sequence the same transcriptome on the Illumina 75+ bp platform.
    Problem: I am trying to decide which assembly software I should use.
    I really like Newbler's isoform prediction, and I am not sure whether it is possible to merge both sets of raw reads into a good assembly.
    I haven't seen any software that can use a transcriptome reference when assembling new reads.

  • #2
    I have worked with Newbler on transcriptome data. While the concept of isotigs is good, 88k contigs with an N50 of 951 is horrible.
    With Illumina data I would advise you to look at Oases/Velvet and Scripture.

    • #3
      I have worked with Newbler on transcriptome data. While the concept of isotigs is good, 88k contigs with an N50 of 951 is horrible.
      Actually, an N50 of 951 sounds very good to me if you're doing a de novo transcript assembly of some reasonably complex eukaryote. Remember too that Newbler uses a different definition of contig than most other assemblers, so the isotig N50 is probably a better value to use when comparing to other tools.
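
For anyone comparing the N50 values quoted in this thread, here is a minimal sketch of how N50 is commonly computed from a list of sequence lengths: sort the lengths from longest to shortest and report the length at which the running sum first reaches half of the total bases. The function name `n50` and the toy lengths are illustrative only; this is not taken from Newbler or any other assembler, and it says nothing about how each tool defines a "contig".

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    N50 is the length L such that sequences of length >= L together
    contain at least half of all bases in the set.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up lengths (not data from this thread).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

Because the result depends entirely on which sequences go into the list, a contig N50 and an isotig N50 from the same assembly can differ a lot, which is why the two should not be compared directly.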

      • #4
        Try adding the Illumina reads to Newbler 2.5! See my blog: http://contig.wordpress.com/2011/01/...her-platforms/

        • #5
          Thanks for the reply. From the look of it I might have to reassemble the raw 454 reads together with the new Illumina reads using both Oases/Velvet and Newbler. I will compare the results of the two methods.

          • #6
            Originally posted by Vickenstein
            [...] Now I am about to sequence the same transcriptome on the Illumina 75+ bp platform. [...]
            I am trying to decide which assembly software I should use.
            I use the current development version of MIRA (V3.2.1.8) and just finished a de novo RNA-Seq assembly of 22 million 100 bp reads.

            Originally posted by Vickenstein
            I really like Newbler's isoform prediction, and I am not sure whether it is possible to merge both sets of raw reads into a good assembly.
            It is. I regularly use MIRA for de novo genome assembly with 454 and Illumina reads (ranging from 36-mers to 100-mers). It should also work with a mixed transcriptome data set.

            Originally posted by Vickenstein
            I haven't seen any software that can use a transcriptome reference when assembling new reads.
            MIRA. No problem if your reference is a transcriptome, but stay away from trying to map RNA-Seq reads to a genome; that will fail miserably at intron/exon boundaries.

            B.

            Disclaimer 1: I'm the author of MIRA, your mileage may vary (but then I'd like to hear about it)
            Disclaimer 2: for data sets with more than 40 million reads you probably want to wait for a later version.

            • #7
              Originally posted by Vickenstein
              Thanks for the reply. From the look of it I might have to reassemble the raw 454 reads together with the new Illumina reads using both Oases/Velvet and Newbler. I will compare the results of the two methods.
              Depending on the library itself, I suspect that coverage issues might also prevent a "good assembly" with non-normalized data. I had a non-normalized human cDNA data set of 3 million Titanium reads in which one fourth of the whole library represented the same gene. Newbler even failed to map and assemble this set (it crashed during consensus calculation). Non-normalized libraries are not the best option for de novo assembly of NGS data, but as I said, it depends on the library.

              Sven

              • #8
                Originally posted by sklages
                I had a non-normalized human cDNA data set of 3 million Titanium reads in which one fourth of the whole library represented the same gene. Newbler even failed to map and assemble this set (it crashed during consensus calculation).
                *cough* 750k times the same gene? I would not expect any program to really assemble that de novo unless it is specially primed. In mapping too, the coverage might be somewhat on the unexpected side.

                B.
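
To put the "750k times the same gene" remark into numbers, here is a rough back-of-the-envelope sketch. Only the 3 million reads and the one-fourth fraction come from the posts above; the mean read length (400 bp) and transcript length (2,000 bp) are purely hypothetical placeholders, so the resulting depth is an illustration of scale and not a measurement from that data set.

```python
def mean_depth(num_reads, mean_read_len_bp, transcript_len_bp):
    """Approximate mean coverage depth: total sequenced bases / target length."""
    return num_reads * mean_read_len_bp / transcript_len_bp


# Figures quoted in the thread.
reads_total = 3_000_000
fraction_on_one_gene = 0.25
reads_on_gene = int(reads_total * fraction_on_one_gene)

# ASSUMPTIONS (hypothetical, not from the thread): read and transcript lengths.
assumed_read_len_bp = 400
assumed_transcript_len_bp = 2_000

depth = mean_depth(reads_on_gene, assumed_read_len_bp, assumed_transcript_len_bp)
print(f"{reads_on_gene} reads on one gene, roughly {depth:,.0f}x depth under these assumptions")
```

Swapping in the real read and transcript lengths changes the number, but under any plausible values the depth on that one gene is several orders of magnitude beyond what a de novo assembler typically expects.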
