  • A Problem with Short Reads in Newbler

    Hi all,

    Just looking for some help with performing de novo assemblies with Newbler.

    We've sequenced our genome using Illumina and have a 200 bp paired-end library. I want to perform a de novo assembly of the data using Newbler; however, it fails to assemble.

    I'm uploading both .fastq files. Newbler indexes them, acknowledges that I have selected paired-end, and then "completes" the assembly but doesn't generate any contigs. The read status error says the reads are too short.

    Given the read lengths are 30 bp, they should be sufficient, though I think the preferred length is 50 bp. I've tried changing the parameters to compensate for the 30 bp read lengths, but with no luck.

    Any feedback would be greatly appreciated.

    Alison

  • #2
    Why do you want to use Newbler? As the in-house assembler from Roche, it is not a natural choice for Illumina data, except when you are combining Illumina reads with 454 data. Even if it does work, it's likely suboptimal. I'd try a short-read assembler instead. Any of the ones listed here are probably reasonable choices, except ALLPATHS-LG, which isn't compatible with your dataset: http://gage.cbcb.umd.edu/assemblers/index.html
    Last edited by nickloman; 01-21-2012, 09:13 AM.


    • #3
      Also, prepare to be disappointed with the results of a de novo assembly generated with 30bp reads, even if paired.


      • #4
        I've got 454 data that is being used as a reference for 3 other genomes that were sequenced using Illumina. We have Illumina data for the reference genome as well, and I've tried doing the assembly by combining this with the 454 data, but it still gives me the same error. Doing the assembly with the Illumina and 454 data for the reference would be handy, but currently my priority is getting an assembly for the Illumina data sets (or at least one of them). I'd like to use just one assembler for all the data, which is why I wanted to use Newbler.

        I'll have a look at one of the ones listed as well. It'll give me something at least.

        Thanks for the information.


        • #5
          According to Lex Nederbragt's excellent blog, Newbler has a minimum length for reads of 50bp and a default minimum overlap of 40bp, which is probably the reason you don't have any luck combining the data. For hybrid Illumina/454 assemblies it might be worth reducing those values so the short reads are used (see http://contig.wordpress.com/2011/04/...-read-contigs/).
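
          To see how much of a data set would clear thresholds like these, a rough check along the following lines may help. This is only a sketch: the function names are made up for illustration, the 50 bp and 40 bp defaults simply mirror the figures quoted above, and the parser assumes plain four-line FASTQ records. It does not call Newbler or reproduce its actual filtering.

```python
import io


def read_lengths(fastq_handle):
    """Yield the sequence length of each record in a four-line-per-record FASTQ stream."""
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of every record holds the bases
            yield len(line.strip())


def usable_fraction(lengths, min_read_len=50, min_overlap=40):
    """Fraction of reads long enough to pass both thresholds (hypothetical helper).

    A read shorter than the minimum overlap can never overlap another read
    by that many bases, so the effective cutoff is the larger of the two.
    """
    lengths = list(lengths)
    if not lengths:
        return 0.0
    cutoff = max(min_read_len, min_overlap)
    return sum(1 for n in lengths if n >= cutoff) / len(lengths)


# Tiny in-memory example: one 30 bp read and one 60 bp read.
demo = io.StringIO(
    "@r1\n" + "A" * 30 + "\n+\n" + "I" * 30 + "\n"
    "@r2\n" + "C" * 60 + "\n+\n" + "I" * 60 + "\n"
)
lengths = list(read_lengths(demo))
print(lengths, usable_fraction(lengths))
```

          With 30 bp reads the fraction is zero at the quoted defaults, which is consistent with a "reads too short" status; only lowering both values below the read length lets any of them through.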

          If you just want to use the Illumina reads, a de Bruijn graph-based short-read assembler will likely perform much better (the overlap-layout-consensus method used by Newbler isn't tailored to large numbers of very short reads).
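
          For anyone unfamiliar with the idea, here is a toy sketch of the de Bruijn approach: reads are broken into k-mers, and each k-mer becomes an edge between its (k-1)-mer prefix and suffix, so no pairwise read overlaps need to be computed. The function names, the reads, and k=4 are all invented for illustration; real assemblers add error correction, coverage handling, and paired-end information that this leaves out.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous(graph, start):
    """Extend a contig from `start` while the current node has exactly one successor."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop rather than loop forever on a cycle
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Three overlapping toy reads drawn from the sequence ATGGCGTGCAA.
reads = ["ATGGCGT", "GGCGTGC", "CGTGCAA"]
graph = de_bruijn_edges(reads, k=4)
print(walk_unambiguous(graph, "ATG"))  # prints ATGGCGTGCAA
```

          The walk stops at the first branch or dead end, which is roughly where a real assembler would end a contig; repeats longer than k-1 bases create such branches, one reason very short reads tend to give fragmented assemblies.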
