  • Assembling pooled BACs from 454 data

    We have a 454 Titanium run of ~50 pooled BACs: not bar-coded, not paired-end, from two clonal lines. The genome is mostly unsequenced and undoubtedly repetitive, and the BACs could overlap.

    I am having trouble assembling the BACs. Newbler runs but then hangs in the 'deconvoluting step'. The TIGR EST clustering pipeline -- hey, I figured this was like an EST program only with bigger "ESTs" -- is throwing most of the reads into one contig even after masking out vector, adapters, etc. Of course ideally one would like to see 50 or so contigs which could then be assembled.

    Does anyone have any papers to read or ideas on how to extract these BACs from the 350 Mbase dataset? I guess that basically I need a good clustering method. After that the assembly itself should be simple.

    Thanks,
    -- Rick

  • #2
    How about dividing sequences into small groups?

    After assembling each group and gathering the contigs, you can assemble all of the contigs together one more time.
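
A minimal sketch of the splitting step, assuming the reads have already been exported to FASTA. The function names `read_fasta` and `split_into_groups` and the file name in the usage comment are hypothetical, and the split is simply random; nothing here is part of Newbler or the TIGR pipeline. Note that each group sees only a fraction of the total coverage.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def split_into_groups(records: Iterable[Record], n_groups: int,
                      seed: int = 0) -> List[List[Record]]:
    """Shuffle the reads and deal them into n_groups roughly equal groups."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Hypothetical usage: write each group to its own file, assemble each file
# separately, then pool the resulting contigs for a final assembly.
# with open("reads.fasta") as handle:
#     groups = split_into_groups(read_fasta(handle), n_groups=10)
# for i, group in enumerate(groups):
#     with open(f"group_{i}.fasta", "w") as out:
#         for header, seq in group:
#             out.write(f">{header}\n{seq}\n")
```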

    Using more stringent criteria, such as higher homology and longer minimum overlaps, can be another approach.
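
To illustrate what the two stringency settings do, here is a toy sketch of an overlap test between two reads. The function `best_overlap` and its default thresholds are assumptions made for illustration, not the parameters of any particular assembler. It compares bases without gaps, whereas real assemblers use gapped alignment, which matters for 454 homopolymer errors.

```python
from typing import Optional, Tuple


def best_overlap(a: str, b: str, min_overlap: int = 40,
                 min_identity: float = 0.98) -> Optional[Tuple[int, float]]:
    """Find the longest overlap between the end of `a` and the start of `b`.

    Returns (overlap_length, identity) for the longest ungapped overlap that
    is at least `min_overlap` bases long and at least `min_identity`
    identical, or None if no overlap passes both thresholds.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    max_len = min(len(a), len(b))
    # Try the longest candidate first so the first hit is the longest overlap.
    for length in range(max_len, min_overlap - 1, -1):
        suffix, prefix = a[-length:], b[:length]
        matches = sum(x == y for x, y in zip(suffix, prefix))
        identity = matches / length
        if identity >= min_identity:
            return length, identity
    return None
```

Raising `min_overlap` rejects joins that rest only on a short shared stretch, and raising `min_identity` rejects joins between similar but non-identical repeat copies. Both make it less likely that reads from different BACs collapse into one contig, at the cost of more fragmented output.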


    • #3
      Originally posted by mgenome
      How about dividing sequences into small groups?

      After assembling each group and gathering the contigs, you can assemble all of the contigs together one more time.
      A good idea and one that I will try. If nothing else I might get to the repetitive parts of the BACs.


      Using more stringent criteria, such as higher homology and longer minimum overlaps, can be another approach.
      Yes. I was running several of these clustering jobs last night, only to come back to work this morning and find that I was over my disk quota and that my programs had crashed in mysterious ways. Who would have guessed that 250 GB would not be enough space?
