  • khb
    Member
    • Dec 2010
    • 15

    Bowtie parameters

    I'm new to next-generation sequencing and am using Bowtie for the first time. Does anyone know how to keep only the reads that have exactly one match? I tried this command:
    bowtie hg19 -q input.fastq -m 1 --best --strata --all -S bowtie_out
    I also tried it without the --all option, but both runs gave the same number of hits.
    Thanks
  • xinwu
    Member
    • Jul 2010
    • 33

    #2
    Originally posted by khb
    I'm new to next-generation sequencing and am using Bowtie for the first time. Does anyone know how to keep only the reads that have exactly one match? I tried this command:
    bowtie hg19 -q input.fastq -m 1 --best --strata --all -S bowtie_out
    I also tried it without the --all option, but both runs gave the same number of hits.
    Thanks
    IMO, --all only means all valid alignments ACCORDING TO YOUR CRITERIA. Since you set -m 1, --all does nothing here. If you set -m 3 without --all, a read with 2 alignments will have only one of them reported (-k is 1 by default); with --all, both alignments will be reported.
    Last edited by xinwu; 12-16-2010, 01:33 AM.
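
The interaction described in this post can be sketched with a toy model. This is a minimal Python sketch of the reporting rules as stated in the thread, not bowtie's actual implementation; the function name `reported_count` and its parameters are made up for illustration.

```python
from typing import Optional


def reported_count(n_valid: int, m: Optional[int] = None, k: int = 1,
                   report_all: bool = False) -> int:
    """Toy model: how many alignments get reported for one read.

    n_valid    -- number of valid alignments found for the read
    m          -- models -m: suppress the read if it has more than m alignments
    k          -- models -k: report up to k alignments (1 by default)
    report_all -- models --all: report every valid alignment
    """
    if m is not None and n_valid > m:
        return 0  # -m suppresses every alignment of this read
    if report_all:
        return n_valid
    return min(k, n_valid)


# With -m 1, --all makes no difference, whatever the number of alignments.
for n in range(5):
    assert reported_count(n, m=1) == reported_count(n, m=1, report_all=True)

# With -m 3, a read with 2 alignments reports 1 by default and 2 with --all.
print(reported_count(2, m=3))                   # 1
print(reported_count(2, m=3, report_all=True))  # 2
```

Under this model a read with a single valid alignment yields one reported alignment with or without --all, which is consistent with both runs giving the same number of hits.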


    • khb
      Member
      • Dec 2010
      • 15

      #3
      So it will work without the -m 1 option?


      • fkrueger
        Senior Member
        • Sep 2009
        • 627

        #4
        Indeed, -m 1 will discard every read that has more than one valid alignment, which means --best --strata --all won't have any effect on the alignment results.


        • xinwu
          Member
          • Jul 2010
          • 33

          #5
          fkrueger is right. The other options make no difference once you set -m 1.


          • xinwu
            Member
            • Jul 2010
            • 33

            #6
            Originally posted by khb
            So it will work without the -m 1 option?
            As you said, you want 'unique' alignments rather than all alignments, so just use -m 1 to achieve this.


            • khb
              Member
              • Dec 2010
              • 15

              #7
              So the conclusion is to use these parameters:
              bowtie hg19 -q input.fastq -m 1 -S bowtie_out
              But isn't this the same as
              bowtie hg19 -q input.fastq -m 1 --best --strata --all -S bowtie_out
              if -m 1 overrides the other parameters?


              • fkrueger
                Senior Member
                • Sep 2009
                • 627

                #8
                They should do exactly the same thing, yes (as you mentioned yourself, omitting --all does not change anything). If you want only absolutely unique sequences, -m 1 is the way to go.

                However, as you do not specify any other options, bowtie will by default use a seed length of 28 bp and allow 2 mismatches in the seed, plus allow more mismatches after that. Depending on your read length, you might want to choose somewhat more stringent mapping parameters so that -m 1 does not remove too many reads (such as -m 1 -l 36 -n 1 or similar).

                Best wishes
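
To compare a few parameter sets such as the one suggested above, it can help to generate the command lines programmatically. The following is a minimal Python sketch that only builds and prints the commands and does not run bowtie; the helper name `bowtie_unique_cmd` is hypothetical, and the index and file names are the ones used earlier in this thread.

```python
import shlex
from typing import List, Optional


def bowtie_unique_cmd(index: str, reads: str, out_sam: str,
                      seed_len: Optional[int] = None,
                      seed_mismatches: Optional[int] = None) -> List[str]:
    """Build (but do not run) a bowtie command that keeps only -m 1 reads."""
    cmd = ["bowtie", index, "-q", reads, "-m", "1"]
    if seed_len is not None:
        cmd += ["-l", str(seed_len)]         # seed length (default 28)
    if seed_mismatches is not None:
        cmd += ["-n", str(seed_mismatches)]  # mismatches allowed in the seed (default 2)
    cmd += ["-S", out_sam]
    return cmd


# Default seed settings, then the more stringent variant from the post.
print(shlex.join(bowtie_unique_cmd("hg19", "input.fastq", "bowtie_out")))
print(shlex.join(bowtie_unique_cmd("hg19", "input.fastq", "bowtie_out", 36, 1)))
```

Returning a list of arguments rather than a single string means the same value could be passed directly to `subprocess.run` if you do want to execute it.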


                • khb
                  Member
                  • Dec 2010
                  • 15

                  #9
                  Thanks.

                  I think it's strange; the output doesn't seem to contain only unique sequences.

                  The read length is 50 bp. Do you think I should have changed the other parameters then? Should I use the --best and --strata options?


                  • fkrueger
                    Senior Member
                    • Sep 2009
                    • 627

                    #10
                    This depends a bit on your application. If you just want to look at mapped positions for peak calling or something similar, it is probably not necessary to remove everything that has another match somewhere else in the genome, albeit with one or a few mismatches. For example, you might get a perfect 50 bp match for your sequence of interest and another match somewhere else with, say, 3 mismatches. Using --best you could report the best alignment, whereas -m 1 would remove the read completely because it has more than one valid alignment (even though one has no mismatches and the other has 3). Reporting the best alignment or a few of them (with the -k <int> option) will probably require some extra filtering afterwards, whereas -m 1 is a quick and safe option for absolutely unique matches.

                    Just try a few parameters and look at the alignment stats, sequences removed due to -m and so on until you are happy with the outcome.
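
The "extra filtering afterwards" mentioned above could take many forms. As one possible illustration (not something proposed in this thread), the Python sketch below keeps a read's best hit only when the runner-up is clearly worse. The function name `keep_best_if_clear` and the `min_gap` threshold are assumptions, and the input is simply a list of mismatch counts that you would have to extract from the alignment output yourself.

```python
from typing import List, Optional


def keep_best_if_clear(mismatches: List[int], min_gap: int = 2) -> Optional[int]:
    """Return the mismatch count of the best hit, or None to discard the read.

    mismatches -- mismatch counts of the hits reported for one read
                  (e.g. collected from a run with -k <int>)
    min_gap    -- hypothetical threshold: the runner-up must have at least
                  this many more mismatches than the best hit
    """
    if not mismatches:
        return None
    ranked = sorted(mismatches)
    if len(ranked) == 1 or ranked[1] - ranked[0] >= min_gap:
        return ranked[0]
    return None  # best hit is not clearly better than the runner-up


# The case from the post: one perfect hit and one hit with 3 mismatches.
print(keep_best_if_clear([0, 3]))  # 0
# A runner-up with only 1 mismatch is too close, so the read is dropped.
print(keep_best_if_clear([0, 1]))  # None
```

By contrast, -m 1 would discard both of these reads, because each has more than one valid alignment.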


                    • dariober
                      Senior Member
                      • May 2010
                      • 311

                      #11
                      Consider also that if you have a read with 1 match with 0 errors plus, say, 3 additional matches with 1 error, then:
                      "-m 1 --best --strata" will report the match with 0 errors (because the match is unique in the best error stratum)
                      While:
                      "-m 1" or "-m 1 --best" will suppress all the alignments and report the read as unmapped.

                      So "-m 1" without "--best --strata" gives the strongest guarantee that a match is unique, although in my opinion it is too conservative.

                      Please correct me if I'm wrong...

                      Dario
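
The example in this post can be checked against a toy model of the strata rule. This is a minimal Python sketch of the behaviour as described here, not bowtie's real code; the function name `report` is made up, and `hits` is a list of mismatch counts for all valid alignments of one read.

```python
from typing import List


def report(hits: List[int], m: int, best: bool = False,
           strata: bool = False) -> List[int]:
    """Toy model of -m with and without --best --strata.

    Returns the mismatch counts of the alignments that would be reported
    (at most one, as with the default -k 1); [] means "unmapped".
    In this model strata only has an effect together with best.
    """
    if not hits:
        return []
    candidates = sorted(hits) if best else list(hits)
    if best and strata:
        # Only the best stratum (fewest mismatches) counts towards -m.
        candidates = [h for h in candidates if h == candidates[0]]
    if len(candidates) > m:
        return []  # too many reportable alignments: suppress the read
    return candidates[:1]


hits = [0, 1, 1, 1]  # 1 match with 0 errors plus 3 matches with 1 error
print(report(hits, m=1, best=True, strata=True))  # [0]
print(report(hits, m=1))                          # []
print(report(hits, m=1, best=True))               # []
```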


                      • xinwu
                        Member
                        • Jul 2010
                        • 33

                        #12
                        Originally posted by dariober
                        Consider also that if you have a read with 1 match with 0 errors plus, say, 3 additional matches with 1 error, then:
                        "-m 1 --best --strata" will report the match with 0 errors (because the match is unique in the best error stratum)
                        While:
                        "-m 1" or "-m 1 --best" will suppress all the alignments and report the read as unmapped.

                        So "-m 1" without "--best --strata" gives the strongest guarantee that a match is unique, although in my opinion it is too conservative.

                        Please correct me if I'm wrong...

                        Dario
                        As the manual says, "A stronger form of uniqueness is enforced when -m is specified but --best and --strata are not.", so your conclusion is right. Things get interesting when you play with --strata. In your case the read is not truly "unique", since it maps to more than one location; but when you add --best --strata, bowtie only looks at the best stratum, finds a single alignment there (0 errors), considers the -m 1 constraint satisfied, and reports that alignment. So, as a matter of common sense, if you want "unique" reads you should not combine --best and --strata with -m 1, since doing so changes which alignments bowtie counts as "valid".
                        I dislike the concept of "strata"; it is not flexible at all compared to a mapping/alignment quality. I wonder why bowtie cannot output something like a "mapping quality".
                        One more thing: whether a read is "unique" depends on your criteria, and the number of mismatches you allow also affects it. In your case, if you allow 0 mismatches and set -m 1, the 1-error alignments are not valid at all and bowtie will report the read as "unique"; if you allow 1 mismatch and set -m 1 without --best --strata, bowtie will not report the read at all, so it will not count as "unique" either.
                        Last edited by xinwu; 12-16-2010, 10:27 PM.
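
The dependence of "uniqueness" on the allowed number of mismatches can be illustrated with a small toy model. This is a Python sketch under the assumptions stated in this post, not bowtie's implementation; `is_unique` and `max_mismatches` are hypothetical names and do not correspond to a specific bowtie flag.

```python
from typing import List


def is_unique(hits: List[int], max_mismatches: int, m: int = 1) -> bool:
    """Toy model: is the read reported under -m (without --best --strata)?

    hits           -- mismatch counts of every candidate alignment of the read
    max_mismatches -- alignments with more mismatches than this are not valid
    m              -- models -m: suppress reads with more than m valid alignments
    """
    valid = [h for h in hits if h <= max_mismatches]
    return 0 < len(valid) <= m


hits = [0, 1, 1, 1]  # dariober's example read
print(is_unique(hits, max_mismatches=0))  # True: only the perfect hit is valid
print(is_unique(hits, max_mismatches=1))  # False: four valid hits exceed -m 1
```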

