Old 02-23-2010, 09:16 AM   #3
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285

Quote:
Originally Posted by kmkocot View Post
Hi all,

I am trying to find a script or a program that I can call in a pipeline that can remove gap-only sites in a multiple sequence alignment. Can you think of anything I can use? I'd also like to delete the columns at either end of each alignment where there are only 2 or fewer sequences with non-gap characters if you can think of anything that can do that.

Thanks!
Kevin

If your alignments are in the SAM format, you can remove all alignments that contain indels with the C program "dbamfilter -i". You could also easily parse the CIGAR field of each SAM record and check for any "I" (insertion) or "D" (deletion) operators.
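
As a rough illustration of the CIGAR-parsing route, here is a minimal Python sketch. It assumes plain-text, tab-delimited SAM input with the CIGAR string in the sixth column; the function names has_indel and filter_sam are made up for this example and are not part of any existing tool.

```python
import re

# Matches a CIGAR operation that is an insertion (I) or a deletion (D),
# e.g. the "2I" in "10M2I24M".
INDEL_OP = re.compile(r"\d+[ID]")


def has_indel(cigar):
    """Return True if the CIGAR string contains an I or D operator."""
    return INDEL_OP.search(cigar) is not None


def filter_sam(lines):
    """Yield header lines and the alignment lines whose CIGAR has no indel.

    Assumes each alignment line has at least six tab-separated fields,
    with the CIGAR string in the sixth one. Shorter lines are skipped.
    """
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue
        if not has_indel(fields[5]):
            yield line
```

To use it in a pipeline, you could pass it any iterable of SAM lines, for example sys.stdout.writelines(filter_sam(sys.stdin)) in a small script fed from a pipe. Unmapped reads with a CIGAR of "*" are kept, since they contain no I or D operators.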

What format are you working with?