  • How to do unique mapping in bowtie2

    Bowtie1 has a handy parameter -m that suppresses all alignments if more than one hit is found.
    But it seems that bowtie2 doesn't have an equivalent parameter. Instead, it reports one alignment when there are multiple hits, which is not the strict definition of "unique" mapping.
    Did I miss anything?

  • #2
    Bowtie2 is sort of like bowtie1 with the -k option. If you want unique hits (which I take to mean alignments where the next best alignment isn't as good), then just keep any alignment with a MAPQ > 1. bowtie2 will give reads with more than one equally good alignment a score of 1 or 0, depending on whether there are any mismatches or not.


    • #3
      Originally posted by dpryan
      Bowtie2 is sort of like bowtie1 with the -k option. If you want unique hits (which I take to mean alignments where the next best alignment isn't as good), then just keep any alignment with a MAPQ > 1. bowtie2 will give reads with more than one equally good alignment a score of 1 or 0, depending on whether there are any mismatches or not.
      Thanks for replying. I think what bowtie1 -m 1 does is discard a read if it can be matched to more than one location. But in bowtie2, -k 1 means it only searches for up to 1 hit. I think they are different.
      Do you mean that, from the bowtie2 (with -k 1) output, I should discard alignments with MAPQ<=1? That's doable, but it would be more convenient if bowtie2 had this option built in.
      Last edited by metheuse; 11-05-2013, 01:41 PM.


      • #4
        Originally posted by metheuse
        Thanks for replying. I think what bowtie1 -m 1 does is discard a read if it can be matched to more than one location. But in bowtie2, -k 1 means it only searches for up to 1 hit. I think they are different.
        If you really want an "-m 1" equivalent, then just remove any reads with an "XS" auxiliary flag. It's rather odd to do that rather than just filter based on MAPQ, though.
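
For anyone who wants to try the XS approach on plain SAM text, here is a minimal sketch rather than a tested pipeline step. It assumes tab-separated SAM on stdin and that the score of the second-best alignment appears as an "XS:i:" optional field, as in bowtie2 output. The function name drop_xs is made up for illustration.

```shell
# drop_xs: read SAM text on stdin, write SAM text on stdout.
# Header lines (starting with "@") pass through unchanged.
# An alignment record is dropped if any optional field
# (column 12 onward) starts with "XS:i:".
drop_xs() {
  awk -F '\t' '
    /^@/ { print; next }
    {
      for (i = 12; i <= NF; i++) {
        if ($i ~ /^XS:i:/) next   # a second alignment was found: skip record
      }
      print
    }
  '
}
```

The intended use is to pipe bowtie2's SAM output through drop_xs. Unaligned reads carry no XS field, so they pass through and would still need to be filtered separately.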


        • #5
          Originally posted by metheuse
          Do you mean that, from the bowtie2 (with -k 1) output, I should discard alignments with MAPQ<=1? That's doable, but it would be more convenient if bowtie2 had this option built in.
          No, this will work with the standard Bowtie2 options, regardless of how many alignments per read are done (including one). The filtering happens at the next step in the pipeline. FWIW, samtools view allows you to filter on MAPQ:

          Code:
          bowtie2 -x <index> {-1 <leftReads> -2 <rightReads> | -U <allReads>} | \
            samtools view -S -q 2
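
To make the effect of -q 2 concrete, the sketch below applies the same cutoff to plain SAM text with awk. samtools view -q INT skips alignments with MAPQ smaller than INT, so -q 2 is the same threshold as MAPQ > 1. The function name mapq_at_least is hypothetical, and unlike samtools view without -h, this sketch keeps the header lines.

```shell
# mapq_at_least MIN: read SAM text on stdin, write SAM text on stdout.
# Header lines (starting with "@") pass through unchanged.
# Alignment records are kept only if MAPQ (column 5) is >= MIN.
mapq_at_least() {
  min="$1"
  awk -F '\t' -v min="$min" '
    /^@/ { print; next }
    ($5 + 0) >= (min + 0) { print }
  '
}
```

Calling mapq_at_least 2 on bowtie2 output keeps the same records that the -q 2 filter above would keep.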


          • #6
            Originally posted by dpryan
            If you really want an "-m 1" equivalent, then just remove any reads with an "XS" auxiliary flag. It's rather odd to do that rather than just filter based on MAPQ, though.
            Thanks. Yes, I understand filtering by MAPQ is a good idea.
            Where can I find the meaning of the different MAPQ values? It seems that bowtie2 (or just tophat2?) changed the scale at some point. I'm never sure about this, and I couldn't find an explicit one-to-one table.


            • #7
              Originally posted by gringer
              No, this will work with the standard Bowtie2 options, regardless of how many alignments per read are done (including one). The filtering happens at the next step in the pipeline. FWIW, samtools view allows you to filter on MAPQ:

              Code:
              bowtie2 -x <index> {-1 <leftReads> -2 <rightReads> | -U <allReads>} | \
                samtools view -S -q 2
              Thanks. In this post someone explains the meaning of each MAPQ value: http://seqanswers.com/forums/showthr...highlight=mapq
              But perhaps it's not up-to-date. Is there an "official" place that explains the meaning of MAPQ values like this?
              255 = unique mapping
              3 = maps to 2 locations in the target
              2 = maps to 3 locations
              1 = maps to 4-9 locations
              0 = maps to 10 or more locations

              Does MAPQ > 1 mean unique mapping? Or is there a more stringent threshold?


              • #8
                Originally posted by metheuse
                Thanks. Yes, I understand filtering by MAPQ is a good idea.
                Where can I find the meaning of the different MAPQ values? It seems that bowtie2 (or just tophat2?) changed the scale at some point. I'm never sure about this, and I couldn't find an explicit one-to-one table.
                For tophat2, I recall the only MAPQ values you'll ever see are 0, 1, 2, 3 and 50 (previously 255). 50 means a unique hit, while 3 means 2 equal hits, 2 means 3 equal hits, etc. For bowtie2, there's no simple explanation. Somewhere on here (or maybe biostars), I've posted the C version of bowtie2's MAPQ calculator that I use in bison (a bisulfite aligner for compute clusters). You can probably search around for that if you really want the gory details (otherwise, it's in the source for bison on sourceforge).
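
One way to check which MAPQ values an aligner actually emits, rather than relying on a remembered table, is to tally column 5 of the SAM output. A rough sketch, with the hypothetical function name mapq_counts, assuming tab-separated SAM text on stdin:

```shell
# mapq_counts: read SAM text on stdin and print one line per distinct
# MAPQ value (column 5), as "MAPQ<TAB>count", sorted by MAPQ.
# Header lines (starting with "@") are ignored.
mapq_counts() {
  awk -F '\t' '
    /^@/ { next }
    { count[$5]++ }
    END { for (q in count) print q "\t" count[q] }
  ' | sort -n
}
```

If the recollection above is right, tophat2 output run through this should show only a handful of distinct values, while bowtie2 output will typically show more.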


                • #9
                  I should qualify the bowtie2 part of my most recent reply. In bowtie2 a MAPQ>1 means a unique hit (i.e., the reported hit is better than the next best hit). Beyond that, the actual calculation of the MAPQ is pretty messy.
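
The "better than the next best hit" criterion can also be written directly against the score tags. The sketch below is not bowtie2's MAPQ calculation. It only assumes that AS:i holds the score of the reported alignment and XS:i the score of the best other alignment found, which is how the bowtie2 manual describes them. The function name best_beats_second is hypothetical, and single-end reads are assumed.

```shell
# best_beats_second: read SAM text on stdin, write SAM text on stdout.
# Header lines (starting with "@") pass through unchanged.
# A record is kept if it has an AS:i score and either has no XS:i field
# or its AS:i score is strictly greater than its XS:i score.
best_beats_second() {
  awk -F '\t' '
    /^@/ { print; next }
    {
      as = ""; xs = ""
      for (i = 12; i <= NF; i++) {
        if ($i ~ /^AS:i:/) as = substr($i, 6)
        if ($i ~ /^XS:i:/) xs = substr($i, 6)
      }
      if (as == "") next   # no alignment score, e.g. an unaligned read
      if (xs == "" || (as + 0) > (xs + 0)) print
    }
  '
}
```

This keeps reads whose reported alignment strictly outscores every other alignment found, and drops ties and unaligned reads. It may not select exactly the same reads as a MAPQ > 1 filter.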


                  • #10
                    Originally posted by dpryan
                    I should qualify the bowtie2 part of my most recent reply. In bowtie2 a MAPQ>1 means a unique hit (i.e., the reported hit is better than the next best hit). Beyond that, the actual calculation of the MAPQ is pretty messy.
                    Thanks! Sorry, I didn't mean to ignore your previous reply. I was just wondering if there is a more complete resource or something. But that's fine! I appreciate your answer!
