  • 454 and homopolymers

    I am planning to use CORTEX to assemble 454 data and I want to use the cut-homopolymer option, but I am having trouble deciding at what length homopolymers become a problem for 454. It seems to me that it would be somewhere in the 4-6 base range; however, I would like to disrupt my reads as little as possible.

    Does anyone have some insight into this? Has anyone looked at error rates as a function of homopolymer length in 454?

    Thanks,
    David

  • #2
    Hi there

    I wrote Cortex, so I know about that, but I have somewhat limited 454 experience.
    If you want to be systematic, you can

    1. Load in your reads multiple times, each time with a different homopolymer threshold, and use the --dump_filtered_readlen_distribution option to dump a file showing how it affects your read lengths. That tells you how much read length you are throwing away.

    2. There's the issue of 454 homopolymer errors, and at what length they become prevalent. I can't help with that, to be honest.

    If you are just making variant calls, as I was when I did this, I would use a limit of 3 and see how I go, but that's quite conservative.
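
To get a feel for step 1 before running Cortex itself, a rough stand-alone sketch along these lines might help. It is not Cortex's code: it assumes that a read is broken at every homopolymer run of length >= the threshold and that the run itself is discarded. The names `cut_homopolymers`, `summarize` and `min_len` are made up for illustration, and the read below is a toy example.

```python
import re
from collections import Counter


def cut_homopolymers(read, threshold):
    """Split a read at every homopolymer run of length >= threshold.

    Assumption: the run itself is dropped and the next fragment starts
    right after it. This may differ from what Cortex actually does.
    """
    if threshold < 1:
        raise ValueError("threshold must be >= 1")
    # One base followed by at least (threshold - 1) copies of itself.
    run = re.compile(r"(.)\1{%d,}" % (threshold - 1))
    fragments, start = [], 0
    for match in run.finditer(read):
        if match.start() > start:
            fragments.append(read[start:match.start()])
        start = match.end()
    if start < len(read):
        fragments.append(read[start:])
    return fragments


def summarize(reads, thresholds, min_len=1):
    """Per threshold: fragment-length counts and fraction of bases kept.

    Only fragments of at least min_len bases count as kept, e.g. set
    min_len to the k-mer size to ignore fragments too short to use.
    """
    total = sum(len(r) for r in reads)
    result = {}
    for t in thresholds:
        lengths = Counter(
            len(frag) for r in reads for frag in cut_homopolymers(r, t)
        )
        kept = sum(n * count for n, count in lengths.items() if n >= min_len)
        result[t] = (lengths, kept / total if total else 0.0)
    return result


if __name__ == "__main__":
    toy_reads = ["ACGTTTTACGAAAC"]  # toy data, not real 454 reads
    for t, (lengths, kept) in summarize(toy_reads, [3, 4, 5], min_len=3).items():
        print(t, dict(lengths), round(kept, 3))
```

Comparing the kept fraction across thresholds gives the same kind of picture as the dumped read-length distribution: how much usable sequence each threshold costs.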


    • #3
      I wrote a script to do #1 and if you set the threshold to 3 or 4, it seems like a lot of reads get fragmented beyond use. I was hoping to balance the threshold with the desire to keep as many reads as possible. It is hard to do this, however, without knowing at what length the error rate explodes for homopolymers. I suppose I could determine this but I am hoping someone on seqanswers has some experience here.
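
If someone does want to measure this directly, one possible approach is sketched below. It assumes you already have, from aligning reads to a trusted reference, a list of (reference homopolymer length, length observed in the read) pairs; how those pairs are extracted is left out. The function names and the toy pairs are hypothetical and are not measured 454 error rates.

```python
from collections import defaultdict


def error_rate_by_length(pairs):
    """Fraction of miscalled homopolymers for each reference run length.

    pairs: iterable of (ref_len, observed_len) tuples, one per
    homopolymer run covered by an aligned read.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for ref_len, observed_len in pairs:
        totals[ref_len] += 1
        if observed_len != ref_len:
            errors[ref_len] += 1
    return {n: errors[n] / totals[n] for n in sorted(totals)}


def first_length_above(rates, max_rate):
    """Smallest run length whose error rate exceeds max_rate, or None."""
    for n in sorted(rates):
        if rates[n] > max_rate:
            return n
    return None


if __name__ == "__main__":
    # Toy observations only, chosen to exercise the code.
    toy_pairs = [(2, 2), (2, 2), (3, 3), (3, 4), (5, 4), (5, 6)]
    rates = error_rate_by_length(toy_pairs)
    print(rates)
    print(first_length_above(rates, 0.25))
```

The length returned by `first_length_above` would be a candidate cut threshold, to be weighed against how many reads get fragmented at that setting.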


      • #4
        Fair enough - good luck!
