SEQanswers

Old 12-16-2010, 12:12 AM   #1
khb
Member
 
Location: Oslo

Join Date: Dec 2010
Posts: 15
Default Bowtie parameters

I'm new to next-generation sequencing and am using Bowtie for the first time. Does anyone know how to keep only the reads that have exactly one match? I tried this command:
bowtie hg19 -q input.fastq -m 1 --best --strata --all -S bowtie_out
I also tried it without the --all option, but both runs gave the same number of hits.
Thanks
Old 12-16-2010, 12:26 AM   #2
xinwu
Member
 
Location: Beijing

Join Date: Jul 2010
Posts: 33
Default

IMO, the --all option only means all valid alignments ACCORDING TO YOUR CRITERIA. Since you set -m to 1, --all does nothing here. If you set -m to 3 without --all, a read with 2 alignments will have only one of them reported (-k is 1 by default); with --all, all of its alignments will be reported (2 in this case).
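
The interaction described above can be sketched as a toy model. This is only an illustration of the reporting rules as stated in this post, not bowtie's actual implementation; the function name and arguments are made up for the example.

```python
def alignments_reported(n_valid, m=None, k=1, report_all=False):
    """Toy model of how many alignments are printed for one read.

    n_valid    -- number of valid alignments the read has
    m          -- the -m limit (None means -m was not given)
    k          -- the -k value (1 by default, as noted in the post)
    report_all -- True when --all is given
    """
    if m is not None and n_valid > m:
        return 0  # -m suppresses every alignment of this read
    if report_all:
        return n_valid
    return min(n_valid, k)


# The case from the post: -m 3 and a read with 2 valid alignments.
print(alignments_reported(2, m=3))                   # without --all
print(alignments_reported(2, m=3, report_all=True))  # with --all

# With -m 1 a read is either suppressed or has at most one alignment,
# so adding --all cannot change the number of reported alignments.
for n in range(5):
    assert alignments_reported(n, m=1) == alignments_reported(n, m=1, report_all=True)
```

Under this model the first call reports 1 alignment and the second reports 2, which matches the description above.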

Last edited by xinwu; 12-16-2010 at 12:33 AM.
Old 12-16-2010, 12:33 AM   #3
khb
Member
 
Location: Oslo

Join Date: Dec 2010
Posts: 15
Default

So it will work without the -m 1 option?
Old 12-16-2010, 12:34 AM   #4
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 625
Default

Indeed, -m 1 will suppress every read that has more than 1 valid alignment, meaning --best --strata --all won't have any effect on the alignment results.
Old 12-16-2010, 12:38 AM   #5
xinwu
Member
 
Location: Beijing

Join Date: Jul 2010
Posts: 33
Default

fkrueger is right. The other options make no difference since you set -m to 1.
Old 12-16-2010, 12:40 AM   #6
xinwu
Member
 
Location: Beijing

Join Date: Jul 2010
Posts: 33
Default

Quote:
Originally Posted by khb View Post
So it will work without the -m 1 option?
As you said, you want 'unique' alignments rather than all alignments, so just use -m 1 to achieve this.
Old 12-16-2010, 12:52 AM   #7
khb
Member
 
Location: Oslo

Join Date: Dec 2010
Posts: 15
Default

So the conclusion is to use these parameters:
bowtie hg19 -q input.fastq -m 1 -S bowtie_out
but isn't this the same as
bowtie hg19 -q input.fastq -m 1 --best --strata --all -S bowtie_out
if -m 1 overrides the other parameters?
Old 12-16-2010, 12:57 AM   #8
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 625
Default

They should do exactly the same thing, yes (as you mentioned yourself, omitting --all does not change anything). If you want only absolutely unique sequences, -m 1 is the way to go.

However, as you do not specify any other options, bowtie will by default use a seed length of 28 bp and allow 2 mismatches in the seed, plus allow more mismatches after that. Depending on your read length, you might want to choose somewhat more stringent mapping parameters so that -m 1 does not remove too many reads (such as -m 1 -l 36 -n 1 or similar).

Best wishes
Old 12-16-2010, 01:05 AM   #9
khb
Member
 
Location: Oslo

Join Date: Dec 2010
Posts: 15
Default

Thanks


I think it's strange; the results don't look like unique sequences.

The read length is 50 bp. Do you think I should change the other parameters then? Should I use the --best and --strata options?
Old 12-16-2010, 02:10 AM   #10
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 625
Default

This depends a bit on your application. If you just want to look at mapped positions for peak calling or something similar, it is probably not necessary to remove everything that has another match somewhere else in the genome, albeit with one or a few mismatches. For example, you might get a perfect 50 bp match for your sequence of interest and another match somewhere else with, say, 3 mismatches. Using --best you could report the best alignment, whereas -m 1 would remove the sequence completely because it has more than one valid alignment (even though one has no mismatches and the other has 3). Reporting the best alignment or a few of them (with the -k <int> option) will probably require some extra filtering afterwards, whereas -m 1 is a quick and safe option for absolutely unique matches.

Just try a few parameters and look at the alignment stats, sequences removed due to -m and so on until you are happy with the outcome.
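
One way to organise such a comparison is to generate the command lines for a small grid of settings first. The sketch below only builds and prints the commands and does not run bowtie; the grid values and the output file names are placeholders chosen for illustration, while the flags (-q, -m, -l, -n, -S) are the ones already used in this thread.

```python
import itertools
import shlex


def bowtie_command(index, reads, out, seed_len, seed_mms, m=1):
    """Build a bowtie command line as an argument list."""
    return [
        "bowtie",
        "-q",                 # reads are in FASTQ format
        "-m", str(m),         # suppress reads with more than m valid alignments
        "-l", str(seed_len),  # seed length
        "-n", str(seed_mms),  # mismatches allowed in the seed
        "-S",                 # write SAM output
        index, reads, out,
    ]


# Hypothetical grid of seed lengths and seed mismatches to compare.
commands = [
    bowtie_command("hg19", "input.fastq", f"bowtie_l{l}_n{n}.sam", l, n)
    for l, n in itertools.product((28, 36), (1, 2))
]

for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line can then be run by hand, and the alignment statistics bowtie prints for each run compared side by side.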
Old 12-16-2010, 02:44 AM   #11
dariober
Senior Member
 
Location: Cambridge, UK

Join Date: May 2010
Posts: 311
Default

Consider also that if you have a read with 1 match with 0 errors plus, say, 3 additional matches with 1 error, then:
"-m 1 --best --strata" will report the match with 0 errors (because the match is unique in the best error stratum)
While:
"-m 1" or "-m 1 --best" will suppress all the alignments and report the read as unmapped.

So "-m 1" without "--best --strata" gives the strongest guarantee that a match is unique, although in my opinion it is too conservative.
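
The difference can be illustrated with a small toy model. It is a simplification written for this example, not bowtie's code: a stratum is taken to be simply the mismatch count of an alignment, -k is ignored, and the function name is made up.

```python
def surviving_alignments(mismatches, m=1, best_strata=False):
    """Return the mismatch counts of the alignments that survive -m.

    mismatches  -- mismatch count of each valid alignment of one read
    best_strata -- True when both --best and --strata are given
    """
    candidates = list(mismatches)
    if not candidates:
        return []
    if best_strata:
        # Only alignments in the best (lowest-mismatch) stratum are counted.
        best = min(candidates)
        candidates = [x for x in candidates if x == best]
    if len(candidates) > m:
        return []  # too many alignments: the read is suppressed
    return candidates


# One match with 0 errors plus three matches with 1 error.
read = [0, 1, 1, 1]
print(surviving_alignments(read, best_strata=True))  # [0]
print(surviving_alignments(read))                    # []
```

With --best --strata only the single 0-error alignment is counted, so the read passes -m 1; without them all four alignments are counted and the read is suppressed.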

Please correct me if I'm wrong...

Dario
Old 12-16-2010, 09:08 PM   #12
xinwu
Member
 
Location: Beijing

Join Date: Jul 2010
Posts: 33
Default

As the manual says, "A stronger form of uniqueness is enforced when -m is specified but --best and --strata are not.", so your conclusion is right. Things get interesting when you play with "strata". In your case, the read is not truly "unique" since it maps to 4 locations, but when you add the --best and --strata options, bowtie only looks at the best stratum (the "best basket"), finds a single alignment there (0 errors), considers the "-m 1" constraint satisfied, and reports this alignment in the output. So, by common sense, if you want "unique" reads, you should not combine --best and --strata with "-m 1", since that distorts which alignments bowtie counts as "valid".
I dislike the concept of "strata"; it is not at all flexible compared with mapping/alignment quality. I wonder why bowtie cannot output something like a "mapping quality".
One more thing: whether a read is "unique" depends on your criteria, and the number of mismatches you allow also affects it. In your case, if you set the number of mismatches to 0 and -m to 1, the three 1-error alignments are not valid at all and bowtie will report the read as "unique"; if you set it to 1 and -m to 1 without the --best and --strata options, bowtie will not report the read at all, and so will not treat it as "unique" either.
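
The last point can be sketched as follows. This is a simplified model for illustration only: the mismatch limit is applied to the whole alignment (closer to bowtie's -v mode than to the seed-based -n mode), and the function name is hypothetical.

```python
def is_reported_as_unique(mismatches, max_mismatches, m=1):
    """Toy check of whether a read passes -m under a given mismatch limit.

    mismatches     -- mismatch count of each candidate alignment of one read
    max_mismatches -- alignments with more mismatches than this are not valid
    """
    valid = [x for x in mismatches if x <= max_mismatches]
    return 0 < len(valid) <= m


# Same read as in the example: one 0-error hit and three 1-error hits.
read = [0, 1, 1, 1]
print(is_reported_as_unique(read, max_mismatches=0))  # True
print(is_reported_as_unique(read, max_mismatches=1))  # False
```

The same read counts as "unique" or not depending only on how many mismatches are allowed, which is why uniqueness should always be stated together with the alignment criteria.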

Last edited by xinwu; 12-16-2010 at 09:27 PM.