SEQanswers

12-19-2011, 04:29 PM   #1
hajime
Member
 
Location: Taipei, Taiwan

Join Date: Mar 2011
Posts: 14
How to choose aligners?

Dear all:

For certain reasons, I need to align my short reads under the following conditions:

refseq: ATCCGATTGCCTCCAAATGCCCTAAATCGTA
my_sq:  ATCC-AT-GCCTC-AAATGCCC-AAA-CG-A

(1) For the first 18 nt from the 5' end (colored red in the original post):
a. use them as the seed
b. allow up to 3 mismatches (the red "-")
(2) Allow 6 mismatches in total (the remaining mismatches are the blue "-")
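
As an illustration, the two conditions can be written as a small ungapped check. This is only a sketch: the function names are hypothetical and do not come from any aligner, and "-" is treated simply as a base that differs from the reference.

```python
# Sketch only: names are illustrative, not part of any aligner.
REFSEQ = "ATCCGATTGCCTCCAAATGCCCTAAATCGTA"
MY_SQ = "ATCC-AT-GCCTC-AAATGCCC-AAA-CG-A"  # "-" marks a mismatching base

SEED_LEN = 18             # first 18 nt from the 5' end
MAX_SEED_MISMATCHES = 3   # condition (1)b
MAX_TOTAL_MISMATCHES = 6  # condition (2)


def mismatch_positions(ref, read):
    """Return 0-based positions where read differs from ref (no gaps)."""
    if len(ref) != len(read):
        raise ValueError("ungapped comparison needs equal lengths")
    return [i for i, (a, b) in enumerate(zip(ref, read)) if a != b]


def passes_filter(ref, read,
                  seed_len=SEED_LEN,
                  max_seed=MAX_SEED_MISMATCHES,
                  max_total=MAX_TOTAL_MISMATCHES):
    """True if the read satisfies both the seed and the total limit."""
    positions = mismatch_positions(ref, read)
    in_seed = sum(1 for p in positions if p < seed_len)
    return in_seed <= max_seed and len(positions) <= max_total


if __name__ == "__main__":
    print(mismatch_positions(REFSEQ, MY_SQ))
    print(passes_filter(REFSEQ, MY_SQ))
```

For the example read, three mismatches fall inside the 18-nt seed and three outside it, so it sits exactly at both limits.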

I have looked at several aligners, but I cannot find one that fits.
For example, if I use bowtie with -n and allow 3 mismatches in the seed, I cannot find a parameter that limits the total to 6 mismatches.

Do you have any suggestions? Thanks a lot!!

Best,
Yi
__________________
Yi John Huang (PhD student)
886-3-2118800 ext. 3731
Graduate Institute of Biomedical Science, Chang Gung University
12-20-2011, 03:05 AM   #2
kga1978
Senior Member
 
Location: Boston, MA

Join Date: Nov 2010
Posts: 100

Give Mosaik a try - adjust -mm and -gop/-gap.
12-20-2011, 03:10 AM   #3
tonybolger
Senior Member
 
Location: berlin

Join Date: Feb 2010
Posts: 156

Quote:
Originally Posted by hajime View Post
Dear all:

For certain reasons, I need to align my short reads under the following conditions:

refseq: ATCCGATTGCCTCCAAATGCCCTAAATCGTA
my_sq:  ATCC-AT-GCCTC-AAATGCCC-AAA-CG-A

Are these deletions or mismatches? If they are deletions, you can't use bowtie: it handles substitution mismatches only, at least in version 1.

BWA may do what you want, but generally NGS aligners don't handle reads with many mismatches / indels all that well.
12-20-2011, 11:57 PM   #4
hajime
Member
 
Location: Taipei, Taiwan

Join Date: Mar 2011
Posts: 14

@Kga1978: Thanks for your suggestion. I'll try it.

@Tonybolger: the "-" indicates mismatches only (not small indels or other variant types). Sorry for the confusion. By the way, I'm not sure how BWA can do what I want. Could you give me more detail on that? Thanks a lot.
12-21-2011, 12:38 AM   #5
tonybolger
Senior Member
 
Location: berlin

Join Date: Feb 2010
Posts: 156

Quote:
Originally Posted by hajime View Post
@Kga1978: Thanks for your suggestion. I'll try it.

@Tonybolger: the "-" indicates mismatches only (not small indels or other variant types). Sorry for the confusion.

OK, but normally '-' is used as a gap filler when there's a deletion in a sequence.

Quote:
Originally Posted by hajime View Post
By the way, I'm not sure how BWA can do what I want. Could you give me more detail on that? Thanks a lot.

A quick look at the BWA manual suggests that the combination "-n 6 -l 18 -k 3" will probably do what I think you want (you might also need/want to disable gaps).
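
A hedged sketch of how that combination might be assembled for `bwa aln`. The file names `ref.fa` and `reads.fq` are placeholders, and the use of `-o 0` (zero gap opens) as the way to disable gaps is an assumption based on the `bwa aln` option list, not something stated in this thread. The code only builds and prints the command; it does not run BWA.

```python
import shlex


def bwa_aln_command(index_prefix, reads_fastq,
                    max_edit=6, seed_len=18, max_seed_edit=3,
                    disable_gaps=True):
    """Build an argument list for `bwa aln` (hypothetical helper)."""
    cmd = [
        "bwa", "aln",
        "-n", str(max_edit),       # maximum edit distance over the read
        "-l", str(seed_len),       # seed length from the 5' end
        "-k", str(max_seed_edit),  # maximum edit distance within the seed
    ]
    if disable_gaps:
        # Assumption: allowing zero gap opens leaves only substitutions.
        cmd += ["-o", "0"]
    cmd += [index_prefix, reads_fastq]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; bwa aln writes its .sai output to stdout.
    print(shlex.join(bwa_aln_command("ref.fa", "reads.fq")))
```

Building the command as a list keeps each flag and value separate, which makes it easy to pass to `subprocess.run` later and to vary one parameter at a time while testing.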
12-21-2011, 01:03 AM   #6
hajime
Member
 
Location: Taipei, Taiwan

Join Date: Mar 2011
Posts: 14

Quote:
Originally Posted by tonybolger View Post
OK, but normally '-' is used as a gap filler when there's a deletion in a sequence.

A quick look at the BWA manual suggests that the combination "-n 6 -l 18 -k 3" will probably do what I think you want (you might also need/want to disable gaps).

Thanks for your kind and quick reply.

Actually, I did read the BWA manual before writing my last post.
However, I think I misunderstood the meaning of the "-n" parameter.

According to the description on the BWA website:
-------------------------
-n NUM Maximum edit distance if the value is INT, or the fraction of missing alignments given 2% uniform base error rate if FLOAT. In the latter case, the maximum edit distance is automatically chosen for different read lengths.
-------------------------

Based on your reply, I'd like to know whether you consider "-n 6" with no gaps allowed to be the same as allowing 6 mismatches in the read.

Thanks again!
12-21-2011, 01:49 AM   #7
tonybolger
Senior Member
 
Location: berlin

Join Date: Feb 2010
Posts: 156

Quote:
Originally Posted by hajime View Post
Based on your reply, I'd like to know whether you consider "-n 6" with no gaps allowed to be the same as allowing 6 mismatches in the read.

I guess that it is, but I would suggest you test it to make sure it does what you want.
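
The reasoning behind that guess can be illustrated with the general definitions: an edit distance counts substitutions, insertions, and deletions, so it equals the mismatch count only when gaps are ruled out. The sketch below uses textbook Hamming and Levenshtein distances on made-up strings; it does not reproduce BWA's internal search.

```python
def hamming(a, b):
    """Number of substitution mismatches between equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal lengths")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Edit distance allowing substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion from a
                            curr[j - 1] + 1,          # insertion into a
                            prev[j - 1] + (x != y)))  # match or substitution
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # A one-base shift: many mismatches, but only two gapped edits.
    a, b = "ACGTACGT", "CGTACGTA"
    print(hamming(a, b), levenshtein(a, b))
```

With gaps permitted, a shifted read can fit within a small edit distance even though it has many positional mismatches, which is why disabling gaps matters if the limit is meant to count mismatches only.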
12-21-2011, 06:28 AM   #8
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 505

I'm not certain if BFAST would work with such a small seed (you would need to modify the indexes), but it's very tolerant of mismatches.
12-21-2011, 08:23 PM   #9
hajime
Member
 
Location: Taipei, Taiwan

Join Date: Mar 2011
Posts: 14

Because I'm still waiting for the sequencing to finish, I cannot test any of the suggestions on real data right now. I'll try to generate some fake reads or download other people's data for further testing.
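
One way to generate such fake reads is sketched below: substitutions are placed at random positions, with a chosen number inside the 18-nt seed and a chosen number after it, and the result is written as a FASTQ record. The helper names, the fixed quality character, and the use of the 31-nt example as the whole "reference" are assumptions made for illustration.

```python
import random

REF = "ATCCGATTGCCTCCAAATGCCCTAAATCGTA"  # 31-nt example from this thread


def mutate(seq, positions, rng):
    """Substitute a different base at each given 0-based position."""
    out = list(seq)
    for p in positions:
        out[p] = rng.choice([b for b in "ACGT" if b != out[p]])
    return "".join(out)


def simulate_read(ref, seed_len, seed_mm, tail_mm, rng):
    """Return (read, positions) with mismatches split between seed and tail."""
    seed_pos = rng.sample(range(seed_len), seed_mm)
    tail_pos = rng.sample(range(seed_len, len(ref)), tail_mm)
    positions = sorted(seed_pos + tail_pos)
    return mutate(ref, positions, rng), positions


def to_fastq(name, seq, qual_char="I"):
    """Format one FASTQ record with a constant, made-up quality string."""
    return f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n"


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the test set is reproducible
    read, positions = simulate_read(REF, 18, 3, 3, rng)
    # Record the planted positions in the read name for later checking.
    name = "fake_" + "_".join(str(p) for p in positions)
    print(to_fastq(name, read), end="")
```

Because the planted mismatch positions are stored in the read name, reads the aligner reports or drops can be compared against what was expected, for example 3+3 reads that should align and 4+2 reads that should not.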

Thanks for all the suggestions. Thank you guys!!
12-21-2011, 09:47 PM   #10
marcowanger
Senior Member
 
Location: Hong Kong

Join Date: Dec 2008
Posts: 350

Take a look at http://seqanswers.com/forums/showthread.php?t=15200
__________________
Marco
All times are GMT -8.