  • Bowtie parameters

    I'm new to next-generation sequencing and am using Bowtie for the first time. Does anyone know how to keep only the reads that have exactly one match? I tried this command:
    bowtie hg19 -q input.fastq -m 1 --best --strata --all -S bowtie_out
    I also tried it without the --all option, but both runs gave the same number of hits.
    Thanks

  • #2
    IMO, --all only means all alignments that are valid ACCORDING TO YOUR CRITERIA. Since you set -m to 1, --all does nothing here. If you set -m to 3 without --all, a read with 2 alignments will have only one of them reported (-k is 1 by default); with --all, all of its alignments will be reported (2 in this case).
    Last edited by xinwu; 12-16-2010, 01:33 AM.
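
A minimal sketch of the interaction described in this post, written as a toy Python model rather than anything taken from bowtie itself. The function name `reported_alignments` and the example coordinates are hypothetical; the model only assumes what the post states: -m suppresses a read with too many valid alignments, -k caps how many are reported (default 1), and --all lifts that cap.

```python
from typing import List, Optional


def reported_alignments(valid: List[str], m: Optional[int] = None,
                        k: int = 1, report_all: bool = False) -> List[str]:
    """Toy model of -m, -k and --all as described in the post above."""
    # -m: suppress every alignment of a read that has more than m valid alignments
    if m is not None and len(valid) > m:
        return []
    # --all: report every valid alignment; otherwise report at most k of them
    return list(valid) if report_all else list(valid[:k])


# Hypothetical read with 2 valid alignments
two_hits = ["chr1:100", "chr7:5000"]

print(len(reported_alignments(two_hits, m=3)))                   # -m 3        -> 1
print(len(reported_alignments(two_hits, m=3, report_all=True)))  # -m 3 --all  -> 2
print(len(reported_alignments(two_hits, m=1, report_all=True)))  # -m 1 --all  -> 0
```

With -m 1 the read is dropped before --all gets a chance to matter, which is why both runs in the original question gave the same number of hits.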


    • #3
      So it will work without the -m 1 option?


      • #4
        Indeed, -m 1 will remove all reads with more than 1 valid alignment, meaning --best --strata --all won't have any effect on the alignment results.


        • #5
          fkrueger is right. The other options make no difference once you set -m to 1.


          • #6
            Originally posted by khb
            So it will work without the m 1 command?
            As you said, you want 'unique' alignments rather than all alignments, so just use -m 1 to achieve this.


            • #7
              So the conclusion is to use these parameters:
              bowtie hg19 -q input.fastq -m 1 -S bowtie_out
              but isn't this the same as
              bowtie hg19 -q input.fastq -m 1 --best --strata --all -S bowtie_out
              if -m 1 overrides the other parameters?


              • #8
                They should do exactly the same thing, yes (as you mentioned yourself, omitting --all does not change anything). If you want only absolutely unique sequences, -m 1 is the way to go.

                However, as you do not specify any other options, bowtie will by default use a seed length of 28 bp and allow 2 mismatches in the seed, plus further mismatches after the seed. Depending on your read length, you might want to choose somewhat more stringent mapping parameters so that -m 1 does not remove too many reads (such as -m 1 -l 36 -n 1 or similar).

                Best wishes
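
To illustrate why stricter seed settings leave fewer valid alignments for -m 1 to count, here is a rough sketch of the seed check described above. It is a simplification under stated assumptions, not bowtie's code: `valid_in_n_mode` is a hypothetical name, positions are 0-based from the 5' end of the read, the quality budget of 70 is assumed to correspond to the default of the -e option (please check the manual), and any rounding of quality values is ignored.

```python
from typing import Sequence


def valid_in_n_mode(mismatch_positions: Sequence[int], quals: Sequence[int],
                    seed_len: int = 28, max_seed_mm: int = 2,
                    max_qual_sum: int = 70) -> bool:
    """Simplified seed-based check: limit mismatches inside the seed and
    the summed base quality over all mismatched positions."""
    # Count mismatches that fall inside the seed (the first seed_len bases)
    seed_mm = sum(1 for p in mismatch_positions if p < seed_len)
    # Sum the qualities at every mismatched position, seed or not
    qual_sum = sum(quals[p] for p in mismatch_positions)
    return seed_mm <= max_seed_mm and qual_sum <= max_qual_sum


# Hypothetical 50 bp read with quality 20 everywhere and 3 mismatches
quals = [20] * 50
mismatches = [5, 30, 45]

print(valid_in_n_mode(mismatches, quals))                             # defaults -> True
print(valid_in_n_mode(mismatches, quals, seed_len=36, max_seed_mm=1)) # like -l 36 -n 1 -> False
```

The same secondary alignment counts as valid under the defaults but not under the stricter settings, so with the stricter settings it no longer causes -m 1 to discard the read.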


                • #9
                  Thanks


                  I think it's strange; the results don't look like unique sequences.

                  The read length is 50 bp. Do you think I should change the other parameters, then? Should I use --best and --strata?


                  • #10
                    This depends a bit on your application. If you just want to look at mapped positions for peak calling or something similar, it is probably not necessary to remove everything that has another match somewhere else in the genome, albeit with one or a few mismatches. For example, you might get a perfect 50 bp match for your sequence of interest and another match somewhere else which has, say, 3 mismatches. Using --best you could report the best alignment; -m 1, however, would remove the read completely because it has more than one valid alignment (even though one has no mismatches and the other has 3). Reporting the best alignment or a few of them (with the -k <int> option) will probably require some extra filtering afterwards, whereas -m 1 is a quick and safe option for absolutely unique matches.

                    Just try a few parameter sets and look at the alignment stats (number of reads removed due to -m and so on) until you are happy with the outcome.
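
One possible form of the "extra filtering afterwards" is sketched below. It is only an illustration under several assumptions: the SAM output contains every alignment of each read (for example from a run with --all), each record carries a standard NM:i edit-distance tag, and the helper name `best_unique_reads` and its `min_gap` threshold are made up for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def best_unique_reads(sam_lines: Iterable[str], min_gap: int = 2) -> List[str]:
    """Keep reads whose best hit beats the runner-up by >= min_gap mismatches."""
    mismatches: Dict[str, List[int]] = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 4:            # FLAG bit 4: read is unmapped
            continue
        # Look for the NM:i tag among the optional fields (assumed present)
        nm = next((int(f[5:]) for f in fields[11:] if f.startswith("NM:i:")), None)
        if nm is not None:
            mismatches[fields[0]].append(nm)
    kept = []
    for name, counts in mismatches.items():
        counts.sort()
        # A single hit is trivially unique; otherwise require a clear margin
        if len(counts) == 1 or counts[1] - counts[0] >= min_gap:
            kept.append(name)
    return kept


def sam(name: str, nm: int) -> str:
    """Build a minimal synthetic SAM record for the example."""
    return "\t".join([name, "0", "chr1", "100", "255", "4M", "*", "0", "0",
                      "ACGT", "IIII", f"NM:i:{nm}"])


example = [sam("readA", 0), sam("readA", 3),   # perfect hit plus a 3-mismatch hit
           sam("readB", 0), sam("readB", 1),   # too close to call
           sam("readC", 2)]                    # single hit
print(best_unique_reads(example))  # ['readA', 'readC']
```

Here readA is the case from the post: -m 1 alone would discard it, while this filter keeps it because the second hit is clearly worse.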


                    • #11
                      Consider also that if you have a read with 1 match with 0 errors plus, say, 3 additional matches with 1 error, then:
                      "-m 1 --best --strata" will report the match with 0 errors (because the match is unique in the best error stratum)
                      while:
                      "-m 1" or "-m 1 --best" will suppress all the alignments and report the read as unmapped.

                      So "-m 1" without "--best --strata" gives the strongest guarantee that a match is unique, although in my opinion it is too conservative.

                      Please correct me if I'm wrong...

                      Dario
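
The case above can be written down as a small toy model. This is a sketch of the behaviour as described in this post, not bowtie's implementation; `count_for_m` and `is_reported` are hypothetical names, and each alignment is represented only by its number of mismatches.

```python
from typing import List


def count_for_m(mismatches: List[int], best: bool = False,
                strata: bool = False) -> int:
    """Number of alignments that -m counts in this toy model."""
    if best and strata and mismatches:
        # With --best --strata only the best stratum is considered
        top = min(mismatches)
        return sum(1 for x in mismatches if x == top)
    # Otherwise every valid alignment counts, whatever its stratum
    return len(mismatches)


def is_reported(mismatches: List[int], m: int = 1, best: bool = False,
                strata: bool = False) -> bool:
    """True if the read survives the -m limit."""
    n = count_for_m(mismatches, best, strata)
    return 0 < n <= m


# 1 match with 0 errors plus 3 matches with 1 error
read = [0, 1, 1, 1]

print(is_reported(read, m=1, best=True, strata=True))  # True
print(is_reported(read, m=1, best=True))               # False
print(is_reported(read, m=1))                          # False
```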


                      • #12
                        As the manual says, "A stronger form of uniqueness is enforced when -m is specified but --best and --strata are not.", so your conclusion is right. Things get interesting when you play with strata. In your case the read is not really "unique", since it also maps to 3 other locations. But when you add --best --strata, bowtie looks only at the best stratum, finds a single alignment there (0 errors), considers the -m 1 constraint satisfied, and reports that alignment. So, as a matter of common sense, if you want "unique" reads you should not combine --best and --strata with -m 1, because they change which alignments bowtie counts as "valid".
                        I dislike the concept of strata; it is not flexible at all compared with a mapping/alignment quality score. I wonder why bowtie cannot output something like a "map quality".
                        One more thing: whether a read is "unique" depends on your criteria, and the number of mismatches you allow also affects it. In your case, if you allow 0 mismatches and set -m 1, the three 1-error alignments are not valid at all, so bowtie will report the read as "unique"; if you allow 1 mismatch and set -m 1 without --best --strata, bowtie will not report the read at all, and of course will not treat it as "unique" either.
                        Last edited by xinwu; 12-16-2010, 10:27 PM.
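
The dependence of "uniqueness" on the mismatch limit can be shown with a few lines of Python. Again this is only a toy model with a hypothetical helper, `unique_under`, that represents each alignment by its mismatch count and applies -m without --best --strata.

```python
from typing import List


def unique_under(mismatches: List[int], max_mm: int, m: int = 1) -> bool:
    """True if the read counts as 'unique' when at most max_mm mismatches
    are allowed and more than m valid alignments lead to suppression."""
    # Only alignments within the mismatch limit are valid at all
    valid = [x for x in mismatches if x <= max_mm]
    return 0 < len(valid) <= m


# 1 alignment with 0 errors plus 3 alignments with 1 error
read = [0, 1, 1, 1]

for max_mm in range(3):
    print(max_mm, unique_under(read, max_mm))
# 0 True
# 1 False
# 2 False
```

The same read switches from "unique" to suppressed as soon as the 1-error alignments become valid, so a uniqueness count is only meaningful together with the mismatch settings that produced it.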
