  • max mismatches in Bowtie2

    Hello,

    Does anyone know how to set the parameters in Bowtie2 so that reads are aligned with no more than 2 mismatches?

    In Bowtie, the -v parameter does this, with a command line like the following:
    >bowtie ref -a -v 2 -f read.fa output.sam

    How can I report all the reads with no more than 2 mismatches in Bowtie2?

    Thanks,
    Yanju

  • #2
    I haven't actually *run* it yet, but I'm talking about bowtie 2 at journal club and I don't think this is actually possible at this point. You'd have to do it by filtering the sam file, I think. MD tag?

    The -N parameter controls the number of mismatches allowed per seed, but now we have overlapping seeds spaced at intervals.


    • #3
      I've been working with it this week. You can't set it as a parameter. The cutoff is based on a minimum score threshold and the score is a function of the number of matches and gaps, and their associated penalties.
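To make the score threshold described in this reply concrete, here is a rough sketch of the arithmetic. It assumes end-to-end mode and what the bowtie2 manual lists as defaults (`--score-min L,-0.6,-0.6` and a maximum mismatch penalty of 6 from `--mp 6,2`); check both against the installed version. The function names `min_score` and `max_mismatches` are made up for illustration.

```shell
# Sketch: minimum valid alignment score for a linear --score-min function
# "L,<const>,<coef>". Assumed defaults: const = -0.6, coef = -0.6.
min_score() {
    # $1 = read length, $2 = constant term, $3 = coefficient
    awk -v len="$1" -v c="${2:--0.6}" -v k="${3:--0.6}" \
        'BEGIN { printf "%.1f\n", c + k * len }'
}

# Sketch: how many full-penalty mismatches fit inside that score budget,
# ignoring gaps and Ns. Assumed default maximum mismatch penalty: 6.
max_mismatches() {
    # $1 = read length, $2 = maximum mismatch penalty
    awk -v len="$1" -v mx="${2:-6}" -v c=-0.6 -v k=-0.6 \
        'BEGIN { printf "%d\n", int(-(c + k * len) / mx) }'
}

min_score 50        # prints -30.6
max_mismatches 50   # prints 5
```

Under those assumptions a 50 bp read can carry about 5 high-quality mismatches before it drops below the threshold, which is why a hard cap of 2 has to be applied afterwards, or approximated by tightening `--score-min`.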


      • #4
        Oh, also look at the SAM optional field XM:i:<N>, which tells you the number of mismatches. (XO and XG give the number of gap opens and gap extensions, and NM is the edit distance.)
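Before choosing a cutoff it can help to see how the XM:i values are distributed. The following is a minimal sketch that tallies them with awk; `xm_histogram` is a hypothetical helper name, and it assumes plain-text SAM input where the optional fields start at column 12.

```shell
# Sketch: count alignment records per XM:i (mismatch count) value.
xm_histogram() {
    awk -F'\t' '
        /^@/ { next }                      # skip header lines
        {
            for (i = 12; i <= NF; i++)     # optional fields start at column 12
                if ($i ~ /^XM:i:[0-9]+$/) {
                    split($i, tag, ":")
                    count[tag[3]]++
                }
        }
        END { for (n in count) print n "\t" count[n] }
    ' "$@" | sort -n
}
```

Called as `xm_histogram raw.sam`, it prints one line per mismatch count followed by the number of records with that count. Records without an XM:i field, such as unaligned reads, are not counted.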


        • #5
          Originally posted by hollandorange
          Hello,

          Does anyone know how to set the parameters in Bowtie2 so that reads are aligned with no more than 2 mismatches?

          In Bowtie, the -v parameter does this, with a command line like the following:
          >bowtie ref -a -v 2 -f read.fa output.sam

          How can I report all the reads with no more than 2 mismatches in Bowtie2?

          Thanks,
          Yanju

          Hi,
          Has your problem been solved? I have now run into the same problem. Could you please tell me how you extracted the reads with no more than 2 mismatches?
          Thanks a lot.


          • #6
            I think you could do it by filtering the SAM file on the fields XM:i:0, XM:i:1 and XM:i:2.

            Probably something like:

            samtools view | cut whatever column it is | grep "XM:i:0" > zero_mismatch.sam

            and then do that for XM:i:1 and XM:i:2, then combine?


            • #7
              Originally posted by mgogol
              Code:
              samtools view | cut whatever column it is | grep "XM:i:0" > zero_mismatch.sam
              I'm pretty sure that the cut in there will mean that only that column is included in the output. The problem is also trickier because the optional fields are tab separated and not necessarily always in the same column. However, if you don't care about the string 'XM:i:X' appearing in the read name, then a regular expression filter should still work fine:

              Code:
              samtools view -Sh - | grep -e "^@" -e "XM:i:[012][^0-9]" > low_mismatch.sam
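If a read name containing 'XM:i:X' is a concern, one alternative is to inspect only the optional fields. This is a minimal sketch using awk rather than grep; `filter_max_mismatches` is a hypothetical helper name, not part of samtools or bowtie2, and it assumes plain-text SAM on standard input.

```shell
# Sketch: keep SAM headers plus alignments whose XM:i value is <= a maximum.
# $1 = maximum number of mismatches (defaults to 2).
filter_max_mismatches() {
    awk -F'\t' -v max="${1:-2}" '
        /^@/ { print; next }               # pass header lines through
        {
            for (i = 12; i <= NF; i++)     # optional fields start at column 12
                if ($i ~ /^XM:i:[0-9]+$/) {
                    # compare the number after "XM:i:" numerically
                    if (substr($i, 6) + 0 <= max + 0) print
                    next
                }
        }
    '
}
```

Assuming the index and read file from the original question, it could be used as `bowtie2 -x ref -f -U read.fa | filter_max_mismatches 2 > low_mismatch.sam`. Because it compares the value numerically and matches whole fields, it also handles counts above 9 and an XM:i field at the end of a line. Records with no XM:i field are dropped.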


              • #8
                Thanks for improving on my hasty and incorrect answer...


                • #9
                  Thanks a lot. Based on that answer, I came up with another solution using Perl. The code is:
                  perl -ne "print if /XM:i:[0-2]/;" raw.sam >cleaned.sam


                  • #10
                    Originally posted by mihuzx
                    Thanks a lot. Based on that answer, I came up with another solution using Perl. The code is:
                    Code:
                    perl -ne "print if /XM:i:[0-2]/;" raw.sam >cleaned.sam
                    You missed out the headers and haven't considered >9 mismatches (unlikely, but it could happen). The perl equivalent (using your syntax) of what I wrote is as follows:

                    Code:
                    perl -ne "print if((/XM:i:[0-2][^0-9]/) || (/^@/));" raw.sam >cleaned.sam
                    But if you're always going to use that filter you might as well just pipe straight from bowtie2 without making the intermediate 'raw.sam' file, as in my previous example.
                    Last edited by gringer; 11-15-2013, 01:17 PM.
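The point about more than 9 mismatches can be checked on a small made-up sample. The two lines below are not real SAM records, only a read name plus a couple of tab-separated tags, which is enough to show how the two patterns differ.

```shell
# Two made-up fragments: r1 has 2 mismatches, r2 has 12.
sample="$(printf 'r1\tXM:i:2\tXO:i:0\nr2\tXM:i:12\tXO:i:0\n')"

# Without the trailing [^0-9], "XM:i:12" matches because it starts with "XM:i:1".
loose=$(printf '%s\n' "$sample" | grep -c 'XM:i:[0-2]')

# With it, the digit must be followed by a non-digit, so only r1 matches.
strict=$(printf '%s\n' "$sample" | grep -c 'XM:i:[0-2][^0-9]')

echo "loose=$loose strict=$strict"   # prints: loose=2 strict=1
```

One caveat with the stricter pattern: it needs some character after the digit, so it would miss a record where XM:i is the very last field on the line. That should not matter when further tags such as XO follow it, but it is worth checking on real output.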


                    • #11
                      Thanks for your quick and well-thought-out answer.
                      I kept trying to work out how to get low_mismatch.sam directly, but failed. Now I know there are still many things for me to learn.
                      Thanks again for your guidance.
