
  • Mask x number of bases WITHIN sequence prior to alignment

    Message moved to correct section
    http://seqanswers.com/forums/showthread.php?t=22014

    Hi all,

    As you can see from the picture, this is the QC for all R2 reads of my paired-end sequenced samples. Due to a technical error during sequencing, I have ended up with 30+ R2 reads with serious errors in the middle of the sequence.

    Do you know of any way to mask (or to allow mismatches at) a specific number of bases (2-3) at a specific position WITHIN the fragment length prior to alignment? Biostrings is an option, but I would prefer not to use it for reasons of speed.

    Can what you propose be selectively applied to only one of the two reads in the paired end samples?

    It would be ideal if this could be applied directly in Bowtie, like the left/right trimming that already exists as a built-in option.



    Last edited by SEQond; 07-27-2012, 05:45 AM. Reason: moved to correct section

  • #2
    ELAND should be able to mask these bases during alignment, since the initial error-free segment is longer than the seed. Include USE_BASES Y41n3Y*n in the config file to mask bases 42-44 (plus the terminal base, which is the default). The one caveat is if the error results from fluidics/chemistry issues, then the phasing after the error may be incorrect.
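
As a rough illustration of how a mask string such as Y41n3Y*n maps onto read positions, the Python sketch below expands the mask for a given read length and replaces the ignored bases with N. The function names `expand_mask` and `apply_mask` are hypothetical, and the parsing rules (Y = use, n = ignore, a number = repeat count, * = fill the remaining cycles) are an assumption based on the description above, not ELAND's own code.

```python
import re

def expand_mask(mask, read_length):
    """Expand a USE_BASES-style mask (e.g. 'Y41n3Y*n') into one 'Y'/'n' flag per base.

    Assumed syntax: each token is 'Y' (use) or 'n' (ignore), optionally followed
    by a repeat count or by '*' meaning "all remaining cycles".
    """
    if not re.fullmatch(r"(?:[Yn](?:\d+|\*)?)+", mask):
        raise ValueError(f"unrecognised mask: {mask!r}")
    tokens = re.findall(r"([Yn])(\d+|\*)?", mask)
    if sum(1 for _, count in tokens if count == "*") > 1:
        raise ValueError("at most one '*' token is supported in this sketch")
    # Cycles claimed by tokens with an explicit count (a bare letter counts as 1).
    fixed = sum(int(count) if count else 1 for _, count in tokens if count != "*")
    fill = read_length - fixed
    if fill < 0:
        raise ValueError("mask is longer than the read")
    parts = []
    for kind, count in tokens:
        if count == "*":
            repeat = fill
        elif count:
            repeat = int(count)
        else:
            repeat = 1
        parts.append(kind * repeat)
    flags = "".join(parts)
    if len(flags) != read_length:
        raise ValueError("mask does not cover the whole read")
    return flags

def apply_mask(seq, mask):
    """Return seq with every ignored ('n') position replaced by 'N'."""
    flags = expand_mask(mask, len(seq))
    return "".join(base if flag == "Y" else "N" for base, flag in zip(seq, flags))

if __name__ == "__main__":
    flags = expand_mask("Y41n3Y*n", 100)
    # 1-based positions that the mask ignores
    print([i + 1 for i, flag in enumerate(flags) if flag == "n"])
```

For a 100-base read, the ignored positions come out as 42, 43, 44 and 100, which matches the "bases 42-44 plus the terminal base" reading of the mask. Writing N into the read is only a way to visualise the mask here; how a given aligner scores N bases is a separate question.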

    If this doesn't work or you prefer an alternative aligner, you could convert the reads into pseudo-paired end data. Use bases 1-41 as read one, then use bases 45-100 as read two. Filter the aligned data on expected criteria (i.e., both reads map to same chromosome and orientation, position of read 2 = read 1 + 44 [or + 41-44 if phasing is off]).
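
The pseudo-paired approach could be sketched in Python roughly as follows. This is a minimal sketch, not a tested pipeline: `split_read` and `is_consistent` are hypothetical helper names, hits are assumed to be `(chromosome, strand, leftmost_position)` tuples taken from whatever aligner output you use, and the reverse-strand arithmetic is my own assumption derived from the segment lengths (41 and 56 bases for a 100-base read).

```python
def split_read(seq, qual, first_end=41, second_start=45):
    """Split one read into two segments, dropping the bases in between.

    With the defaults, bases 1-41 become segment one and bases 45-end become
    segment two (1-based coordinates), so bases 42-44 are discarded.
    """
    seg1 = (seq[:first_end], qual[:first_end])
    seg2 = (seq[second_start - 1:], qual[second_start - 1:])
    return seg1, seg2

def is_consistent(hit1, hit2, seg1_len=41, seg2_len=56, gaps=range(0, 4)):
    """Check that two segment alignments look like one contiguous read.

    Each hit is (chromosome, strand, leftmost_position). The gap is the number
    of genomic bases between the segments: 3 if exactly three bases were
    skipped, 0-3 if phasing after the error is off.
    """
    chrom1, strand1, pos1 = hit1
    chrom2, strand2, pos2 = hit2
    if chrom1 != chrom2 or strand1 != strand2:
        return False
    if strand1 == "+":
        # Forward strand: segment two starts 41-44 bases after segment one.
        gap = pos2 - pos1 - seg1_len
    else:
        # Reverse strand (assumption): segment two lies to the left of segment one.
        gap = pos1 - pos2 - seg2_len
    return gap in gaps

if __name__ == "__main__":
    read = "ACGT" * 25          # toy 100-base read
    quals = "I" * 100
    (s1, q1), (s2, q2) = split_read(read, quals)
    print(len(s1), len(s2))
    print(is_consistent(("chr1", "+", 1000), ("chr1", "+", 1044)))
```

On the forward strand this accepts a segment-two position of segment one + 41 to + 44, as described above; pairs on different chromosomes or strands are rejected.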



    • #3
      Originally posted by HESmith
      ELAND should be able to mask these bases during alignment, since the initial error-free segment is longer than the seed. Include USE_BASES Y41n3Y*n in the config file to mask bases 42-44 (plus the terminal base, which is the default). The one caveat is if the error results from fluidics/chemistry issues, then the phasing after the error may be incorrect.

      If this doesn't work or you prefer an alternative aligner, you could convert the reads into pseudo-paired end data. Use bases 1-41 as read one, then use bases 45-100 as read two. Filter the aligned data on expected criteria (i.e., both reads map to same chromosome and orientation, position of read 2 = read 1 + 44 [or + 41-44 if phasing is off]).

      Can ELAND align paired-end sequences in this way while selectively masking bases in only one of the two reads?

      To be honest, I would prefer a BWT-based aligner (Bowtie, BWA, or SOAP2).
      Thanks for your answer.



      • #4
        Possibly Bowtie 2 is the answer to this issue.

        also look here or here
        Last edited by SEQond; 08-13-2012, 08:18 AM.

