  • Vickenstein
    Junior Member
    • Mar 2011
    • 2

    Assembling De Novo 454 Transcriptome Contigs and Singletons with Illumina Short Reads

    Background: I have assembled 4 million non-normalized 454 Titanium de novo transcriptome reads using Newbler 2.5 and have gotten amazing results: 88,000 contigs with an N50 of 951 bp, 35,000 isotigs with an N50 of 1,500 bp (the longest being 7,900 bp), and 276,564 singletons. Now I am about to sequence the same transcriptome on the Illumina 75+ bp platform.
    Problem: I am contemplating which assembly software I should use.
    I really like Newbler's isoform prediction, and am not sure if it is possible to merge both sets of raw reads together in a good assembly.
    I haven't seen any software that is able to use a transcriptome reference for assembling new reads.
  • aparna
    Member
    • Feb 2009
    • 17

    #2
    I worked with Newbler on transcriptome data; while the concept of isotigs is good, 88k contigs with an N50 of 951 is horrible.
    With Illumina data I would advise you to look at Oases/Velvet and Scripture.


    • cram
      Member
      • Nov 2008
      • 16

      #3
      Originally posted by aparna
      I worked with Newbler on transcriptome data; while the concept of isotigs is good, 88k contigs with an N50 of 951 is horrible.
      Actually, an N50 of 951 sounds very good to me if you're doing a de novo transcript assembly of some reasonably complex eukaryote. Remember, too, that Newbler uses a different definition of contig than most other assemblers, and the isotig N50 is probably a better value to use when comparing to other tools.
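
For readers comparing the contig and isotig numbers above, here is a minimal sketch of how N50 is commonly computed: the length L such that sequences of length L or longer hold at least half of the total assembled bases. The function name `n50` and the example lengths are made up for illustration and are not taken from the assembly in this thread.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # crossed 50% of all bases
            return length


# Hypothetical lengths, chosen only to illustrate the calculation.
example_lengths = [2000, 1500, 1000, 500, 500, 300, 200]
print(n50(example_lengths))                   # 1500

# Adding many short sequences to the same set lowers the N50, which is
# one reason the definition of "contig" matters when comparing tools.
with_short = example_lengths + [100] * 30
print(n50(with_short))                        # 1000
```

Because the value depends on which sequences are counted, an N50 over Newbler contigs and an N50 over isotigs are not directly comparable.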


      • flxlex
        Moderator
        • Nov 2008
        • 412

        #4
        Try adding the Illumina reads to Newbler 2.5! See my blog: http://contig.wordpress.com/2011/01/...her-platforms/


        • Vickenstein
          Junior Member
          • Mar 2011
          • 2

          #5
          Thanks for the reply. From the look of it, I might have to reassemble the raw 454 reads together with the new Illumina reads using both Oases/Velvet and Newbler. I will compare the results of these two methods.


          • BaCh
            Member
            • May 2008
            • 81

            #6
            Originally posted by Vickenstein
            [...] Now I am about to sequence the same transcriptome on the Illumina 75+ bp platform. [...]
            I am contemplating which assembly software I should use.
            I use the current development version of MIRA (V3.2.1.8) and just went through an RNA-Seq de novo assembly of 22 million 100 bp reads.

            Originally posted by Vickenstein
            I really like Newbler's isoform prediction, and am not sure if it is possible to merge both sets of raw reads together in a good assembly.
            It is. I regularly use MIRA for de novo genome assembly with 454 and Illumina reads (ranging from 36 to 100-mers). It should also work with a mixed transcriptome.

            Originally posted by Vickenstein
            I haven't seen any software that is able to use a transcriptome reference for assembling new reads.
            MIRA. No problem if your reference is a transcriptome, but stay away from trying to map RNA-Seq reads to a genome; that will fail miserably at intron/exon boundaries.

            B.

            Disclaimer 1: I'm the author of MIRA, your mileage may vary (but then I'd like to hear about it)
            Disclaimer 2: for data sets with more than 40m reads you probably want to wait for a next version.


            • sklages
              Senior Member
              • May 2008
              • 628

              #7
              Originally posted by Vickenstein
              Thanks for the reply. From the look of it, I might have to reassemble the raw 454 reads together with the new Illumina reads using both Oases/Velvet and Newbler. I will compare the results of these two methods.
              Depending on the library itself, I suspect that coverage issues might also prevent a "good assembly" with non-normalized data. I had a non-normalized human cDNA dataset of 3 million Titanium reads; one fourth of the whole library represented the same gene ... Newbler even failed to map and assemble this set (it crashed during consensus calculation). Non-normalized libraries are not the best option for de novo assembly with NGS data, but as I said, it depends on the library.

              Sven


              • BaCh
                Member
                • May 2008
                • 81

                #8
                Originally posted by sklages
                I had a non-normalized human cDNA dataset of 3 million Titanium reads; one fourth of the whole library represented the same gene ... Newbler even failed to map and assemble this set (it crashed during consensus calculation).
                *cough* 750k times the same gene? I would not expect any program to really assemble that de-novo if not specially primed. In mapping too, the coverage might be somewhat on the unexpected side.
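
To put the 750k figure in perspective, here is a rough back-of-the-envelope sketch. Only the 3 million reads and the one-fourth share come from the posts above; the 400 bp mean read length and the 2,000 bp transcript length are assumptions chosen for illustration, and the function names are hypothetical.

```python
def reads_for_fraction(total_reads, fraction):
    """Number of reads that a given share of the library corresponds to."""
    return round(total_reads * fraction)


def mean_coverage(n_reads, read_length_bp, transcript_length_bp):
    """Average depth if n_reads of a given length all derive from one transcript."""
    return n_reads * read_length_bp / transcript_length_bp


total_reads = 3_000_000      # from the post: 3 million Titanium reads
dominant_share = 0.25        # from the post: one fourth of the library

# Assumptions, not stated anywhere in the thread:
assumed_read_length_bp = 400
assumed_transcript_length_bp = 2_000

dominant_reads = reads_for_fraction(total_reads, dominant_share)
coverage = mean_coverage(dominant_reads,
                         assumed_read_length_bp,
                         assumed_transcript_length_bp)

print(dominant_reads)        # 750000
print(coverage)              # 150000.0
```

Under these assumed lengths, a single transcript would sit at a depth in the hundred-thousand-fold range, far beyond what assemblers are typically tuned for.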

                B.

