  • Allowing a high number of mismatches when mapping

    Dear all,

    I have 53 bp sequences, of which between 23 and 30 bases are of interest (the motifs). For simplicity, I took only the first 17 bases. Each sample has between 5 and 23 million reads.
    The reference is composed of 7450 distinct sequences. For simplicity, I took the first 17 bases of each reference sequence.
    My goal is to map the motifs to the reference.

    If there were no sequencing errors, I would find only 7450 distinct motifs in my samples. However, there was most likely a problem during sequencing, and 25% of the reads have poor quality.
    When mapping with bowtie
    Code:
    bowtie --best --strata -v 2 -k 1 -m 1 --norc
    the mapping rate is ~ 70-82%.
    I used -v 3 on two samples, and it increased the mapping rate by only ~1.5%.

    Since my reference is small (7450 distinct sequences), I know that fewer than 17 bases (sometimes 6 bases are sufficient) are enough to uniquely identify which of the 7450 references a sequence comes from. Thus, for this specific case, I need to allow a higher number of mismatches (bowtie is limited to 3).
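One way to picture "allowing more mismatches" on a small reference is a brute-force comparison outside of any mapper. The following is only a minimal sketch, assuming reads and references have already been trimmed to the same length; the function names, the toy sequences, and the `max_mismatches` parameter are hypothetical and not part of bowtie or any other tool. It keeps a read only when a single reference is the best hit, which is loosely in the spirit of `--best --strata -m 1`.

```python
from typing import Dict, Optional, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_read(
    read: str, refs: Dict[str, str], max_mismatches: int
) -> Optional[Tuple[str, int]]:
    """Return (reference name, mismatches) for the unique best hit.

    Returns None when no reference is within max_mismatches, or when
    two or more references tie for the lowest mismatch count.
    """
    best_name: Optional[str] = None
    best_dist = max_mismatches + 1  # anything at or above this is rejected
    tie = False
    for name, seq in refs.items():
        d = hamming(read, seq)
        if d < best_dist:
            best_name, best_dist, tie = name, d, False
        elif d == best_dist:
            tie = True
    if best_name is None or tie:
        return None
    return best_name, best_dist


# Toy data (hypothetical sequences, 17 bases each).
refs = {
    "sh1": "ACGTACGTACGTACGTA",
    "sh2": "TTTTGGGGCCCCAAAAT",
}
read = "ACGTTCGAATGTAGGCA"  # differs from sh1 at 5 positions

hit = assign_read(read, refs, max_mismatches=6)
```

Comparing every read against all 7450 references in pure Python would be slow for millions of reads; counting identical reads first and assigning each distinct sequence once would reduce the work considerably.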

    I intend to try bowtie2 in local mode. I do not know RMAP (https://omictools.com/rmap-tool), but it also seems to fit my question.

    Could you please give me some suggestions or ideas for dealing with this particular case?
    Thank you very much for your help.

  • #2
    I suggest you try BBMap, which is quite tolerant of low identity; it typically allows mapping down to around 60-70% identity. For very high sensitivity, try this command:

    Code:
    bbmap.sh in=reads.fq out=mapped.sam vslow minid=0.6 maxindel=5 k=11
    Using only the first 17 bp of sequences will hurt the ability to map with BBMap, though; you need to use the full sequences.

    • #3
      Thank you Brian for your suggestion.
      I am doing some tests with Bowtie on shorter sequences, and if that doesn't work, I will try BBMap. The maximum length I can use is 23 bp. Would that be sufficient?
      Last edited by Jane M; 07-25-2016, 01:08 AM.

      • #4
        23 is fine, but more bases will always increase specificity. If your sequences are 53 bp, why are you cutting them down to 23?

        • #5
          Thank you for your answer.

          I am working on an sh screen. The first 22-30 bases are common to all sequences. Between 23 and 31 bases correspond to the sh in each sequence.
          Since there is a quality problem at the end of the reads (in fact, from the middle onward), I use the minimum number of bases (from the left) needed to discriminate the sh.
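The "minimum number of bases needed to discriminate" can be checked directly on the reference set. The sketch below is an illustration under assumptions, not the poster's actual procedure: both function names are hypothetical, and the references are assumed to be plain strings that already start at the first sh base. The second function reports the smallest pairwise Hamming distance at a given prefix length; with a unique-best-hit rule, a read with up to m substitutions is guaranteed to land on the right reference only if that distance is at least 2m + 1.

```python
from itertools import combinations
from typing import List, Optional, Sequence


def min_unique_prefix_length(refs: Sequence[str]) -> Optional[int]:
    """Smallest k such that the first k bases of every reference are distinct.

    Returns None if the list is empty or if no prefix length (up to the
    shortest reference) separates all sequences.
    """
    seqs: List[str] = list(refs)
    if not seqs:
        return None
    max_len = min(len(s) for s in seqs)
    for k in range(1, max_len + 1):
        if len({s[:k] for s in seqs}) == len(seqs):
            return k
    return None


def min_pairwise_hamming(refs: Sequence[str], k: int) -> int:
    """Smallest Hamming distance between any two k-base reference prefixes."""
    prefixes = [s[:k] for s in refs]
    return min(
        sum(x != y for x, y in zip(a, b))
        for a, b in combinations(prefixes, 2)
    )


# Toy references (hypothetical): distinct only once all 4 bases are used.
toy_refs = ["ACGT", "ACGA", "TTTT"]
k_min = min_unique_prefix_length(toy_refs)
d_min = min_pairwise_hamming(toy_refs, 4)
```

A prefix that is merely distinct (minimum distance 1) leaves no room for sequencing errors, which is consistent with the advice in #4 that more bases increase specificity.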
