  • Vickenstein
    Junior Member
    • Mar 2011
    • 2

    Assembling De Novo 454 Transcriptome Contigs and Singletons with Illumina Short Reads

    Background: I have assembled 4 million non-normalized 454 Titanium de novo transcriptome reads using Newbler 2.5 and have gotten amazing results: 88000 contigs with an N50 of 951 bp, 35000 isotigs with an N50 of 1500 (the longest being 7900), and 276564 singletons. Now I am about to sequence the same transcriptome on the Illumina 75+ bp platform.
    Problem: I am contemplating which assembly software I should use.
    I really like Newbler's isoform prediction, and am not sure whether it is possible to merge both sets of raw reads into a good assembly.
    I haven't seen any software that can use a transcriptome reference for assembling new reads.
  • aparna
    Member
    • Feb 2009
    • 17

    #2
    I have worked with Newbler on transcriptome data. While the concept of isotigs is good, 88k contigs with an N50 of 951 is horrible.
    With Illumina data I would advise you to look at Oases/Velvet and Scripture.


    • cram
      Member
      • Nov 2008
      • 16

      #3
      I have worked with Newbler on transcriptome data. While the concept of isotigs is good, 88k contigs with an N50 of 951 is horrible.
      Actually, an N50 of 951 sounds very good to me if you're doing a de-novo transcript assembly of some reasonably complex eukaryote. Remember too, Newbler uses a different definition of contig than most other assemblers, and the isotig N50 is probably a better value to use when comparing to other tools.
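For readers comparing the contig and isotig figures quoted in this thread, here is a minimal sketch of how N50 is commonly computed from a list of sequence lengths. It uses the usual definition (the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the summed length). The function name `n50` and the example lengths are illustrative assumptions, not taken from Newbler or any other assembler's output.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Common definition: sort lengths from longest to shortest and return
    the length at which the running total first reaches at least half
    of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical lengths, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))  # prints 400
```

Because the value depends entirely on which sequences go into the list, an N50 over Newbler contigs and an N50 over isotigs are not directly comparable, which is the point made above.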


      • flxlex
        Moderator
        • Nov 2008
        • 412

        #4
        Try adding the Illumina reads to newbler 2.5! See (my blog): http://contig.wordpress.com/2011/01/...her-platforms/


        • Vickenstein
          Junior Member
          • Mar 2011
          • 2

          #5
          Thanks for the reply. From the look of it I might have to reassemble the raw reads from 454 and the new reads from Illumina using both Oases/Velvet and Newbler. I will compare the results between these two methods.


          • BaCh
            Member
            • May 2008
            • 81

            #6
            Originally posted by Vickenstein
            [...] Now I am about to sequence the same transcriptome on the Illumina 75+ bp platform. [...]
            I am contemplating the assembly software that I should be using.
            I use the current development version of MIRA (V3.2.1.8) and just went through a de novo RNA-Seq assembly of 22 million 100 bp reads.

            Originally posted by Vickenstein
            I really like Newbler's isoform prediction, and am not sure whether it is possible to merge both sets of raw reads into a good assembly.
            It is. I regularly use MIRA for genome de-novo with 454 and Illumina (ranging from 36 to 100mers). Should also work with mixed transcriptome.

            Originally posted by Vickenstein
            I haven't seen any software that can use a transcriptome reference for assembling new reads.
            MIRA. No problem if your reference is a transcriptome, but stay away from trying to map RNA-Seq reads to a genome; that will fail miserably at intron/exon boundaries.

            B.

            Disclaimer 1: I'm the author of MIRA, your mileage may vary (but then I'd like to hear about it)
            Disclaimer 2: for data sets with more than 40m reads you probably want to wait for a next version.


            • sklages
              Senior Member
              • May 2008
              • 628

              #7
              Originally posted by Vickenstein
              Thanks for the reply. From the look of it I might have to reassemble the raw reads from 454 and the new reads from Illumina using both Oases/Velvet and Newbler. I will compare the results between these two methods.
              Depending on the library itself, I suspect that coverage issues might also prevent a "good assembly" with non-normalized data. I had a non-normalized human cDNA dataset of 3 million Titanium reads; one fourth of the whole library represented the same gene ... Newbler even failed to map and assemble this set (it crashed during consensus calculation). Non-normalized libraries are not the best option for de novo assembly with NGS data ... but as I said, it depends on the library.

              Sven


              • BaCh
                Member
                • May 2008
                • 81

                #8
                Originally posted by sklages
                I had a non-normalized human cDNA dataset of 3 million Titanium reads; one fourth of the whole library represented the same gene ... Newbler even failed to map and assemble this set (it crashed during consensus calculation).
                *cough* 750k times the same gene? I would not expect any program to really assemble that de-novo if not specially primed. In mapping too, the coverage might be somewhat on the unexpected side.

                B.
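The "750k" figure above is simply one fourth of 3 million reads. To get a rough feel for what that means in fold coverage, here is a back-of-the-envelope sketch. The mean read length (400 bp) and the transcript length (2000 bp) are assumptions picked purely for illustration; neither number is given in the thread, and the helper name `fold_coverage` is hypothetical.

```python
def fold_coverage(n_reads, mean_read_len, target_len):
    """Approximate fold coverage: total sequenced bases / target length."""
    return n_reads * mean_read_len / target_len


total_reads = 3_000_000      # from the post: ~3 million Titanium reads
fraction_same_gene = 0.25    # from the post: one fourth of the library
reads_on_gene = int(total_reads * fraction_same_gene)

# Assumed values, NOT from the thread:
ASSUMED_MEAN_READ_LEN = 400      # bp
ASSUMED_TRANSCRIPT_LEN = 2000    # bp

print(reads_on_gene)  # prints 750000
print(fold_coverage(reads_on_gene, ASSUMED_MEAN_READ_LEN, ASSUMED_TRANSCRIPT_LEN))  # prints 150000.0
```

Under those assumed lengths, a single transcript would sit at coverage on the order of 100,000x, which gives a sense of why an assembler not tuned for such depth might struggle.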
