
  • Expanding the coverage of an old assembly

    Hi,

    I sequenced the genome of an organism a few years ago and got an ok assembly (~100 contigs but many regions with N's).

    Recently, we re-sequenced the genome of the same organism and I would like to map the new reads to the old nucleotide contigs. With an increased insert length, I was hoping to also expand the contigs in an attempt to reduce their number.

    I used bowtie to map my reads to the old contig sequence but this didn't expand my contig lengths.

    Is there a way to do this?

  • #2
    BBMap will map reads well off the ends of contigs, into the N regions. Of course, you still have to pile up the mapped reads and generate a consensus with a separate tool to expand the contigs; mappers won't do that.

    You could also try a scaffolding tool; some of them will fill captured gaps (in which one read of a pair maps to each contig). Improving assemblies with lots of contigs is basically what they exist for.
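    As a rough illustration of spotting reads that hang off contig ends, here is a minimal Python sketch. It assumes the mapper writes SAM output and reports off-end bases as soft clips (`S` in the CIGAR); how a given mapper represents off-end alignments should be checked before relying on this. The function names are hypothetical and not part of BBMap or any other tool.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Split a CIGAR string such as '10S90M' into [(10, 'S'), (90, 'M')]."""
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]

def reference_span(ops):
    """Number of reference bases covered (M, D, N, =, X consume the reference)."""
    return sum(n for n, op in ops if op in "MDN=X")

def overhang(pos, cigar, contig_len):
    """Return (left, right) counts of soft-clipped bases at the contig ends.

    pos is the 1-based leftmost mapped position from the SAM POS field.
    Hard clips (H) are ignored in this sketch.
    """
    ops = parse_cigar(cigar)
    if not ops:
        return (0, 0)
    end = pos + reference_span(ops) - 1
    left = ops[0][0] if ops[0][1] == "S" and pos == 1 else 0
    right = ops[-1][0] if ops[-1][1] == "S" and end == contig_len else 0
    return (left, right)
```

    Reads with a non-zero overhang are the ones that could contribute to a consensus beyond the current contig boundary; building that consensus is still a separate step.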



    • #3
      I agree with Brian that what you probably want is a scaffolding program. A recent review was published in:



      • #4
        SOAPdenovo’s gapfilling module is what you’re really looking for I think. After that is done, you could try using both your old data and new data to run another round of scaffolding, but if the insert sizes are about the same, you might not have much luck in actually improving the scaffold length.

        Also, if your new data is really this much better, you might be best served to just do a new assembly. I know that can mean redoing a lot of other work (repeat/gene annotations for example), but depending on how much other work you’re planning to do on this genome, it may make your life easier for years to come.
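        To get a feel for how many gaps a gap-filling step would have to close, one option is to list the runs of N in each contig first. The sketch below is plain Python and not tied to SOAPdenovo or any other tool; `n_runs` and the `min_len` parameter are hypothetical names chosen for illustration.

```python
import re

def n_runs(seq, min_len=1):
    """Return 0-based, half-open (start, end) intervals of N runs in seq."""
    return [
        (m.start(), m.end())
        for m in re.finditer(r"[Nn]+", seq)
        if m.end() - m.start() >= min_len
    ]

if __name__ == "__main__":
    # Toy sequence with two gaps.
    for start, end in n_runs("ACGTNNNNACGTnnAC"):
        print(start, end, end - start)
```

        Comparing the number and total length of these runs before and after gap filling gives a simple measure of whether the new reads actually helped.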



        • #5
          I haven't used a gap closer before. There seem to be at least three: IMAGE, SOAP's GapCloser, and GapFiller. I have a project somewhat similar to yours -- old data already in contigs/scaffolds with new data fresh off the sequencer to use as additional information. Unlike your project, my original genome had way too many scaffolds to be useful (i.e., publishable), so I will probably be doing a fresh assembly. But I am exploring the gap closer programs and will report back if I find anything interesting.



          • #6
            Thanks!

            I did plan on reassembling the new data, but there seems to be contamination and this is making for a poor assembly. It is for this reason that I wanted to use the old assembly to capture the reads of my organism and then build on the contig length of the old assembly, while at the same time excluding the contaminating reads.
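            The read-capture idea can be sketched roughly as follows, assuming the new reads have already been mapped to the old contigs and written out as SAM. A read counts as captured here if its "unmapped" flag bit (0x4) is not set; because mates share a read name, a pair is kept when either mate maps. `mapped_read_names` is a hypothetical helper, not part of any mapper.

```python
def mapped_read_names(sam_lines):
    """Collect the names of reads that mapped to the reference.

    sam_lines is any iterable of SAM text lines (for example an open file).
    """
    names = set()
    for line in sam_lines:
        if line.startswith("@"):      # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:          # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & 0x4:                # bit 0x4: read is unmapped
            continue
        names.add(fields[0])
    return names
```

            The resulting set of names could then be used to pull the matching records out of the original FASTQ files, leaving the reads that never hit the old assembly (including likely contaminants) behind. Note that this also discards genuine reads from regions missing in the old assembly.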

            Thanks for the info on the gapfillers and the article! I'll give it a go.

            Westerman - that would be great if you could let me know how the new assembly goes.

            cheers
