SEQanswers

Script to remove gap-only sites from fasta alignment? (http://seqanswers.com/forums/showthread.php?t=4059)

kmkocot 02-15-2010 04:21 PM

Script to remove gap-only sites from fasta alignment?
 
Hi all,

I am trying to find a script or a program that I can call in a pipeline that can remove gap-only sites in a multiple sequence alignment. Can you think of anything I can use? I'd also like to delete the columns at either end of each alignment where there are only 2 or fewer sequences with non-gap characters if you can think of anything that can do that.

Thanks!
Kevin

maubp 02-23-2010 02:19 AM

I can picture how I would solve this using Biopython, provided the whole alignment can be loaded into RAM.

How big are the alignments (number of columns, number of rows)?
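
For alignments small enough to hold in memory, removing gap-only columns could be sketched roughly as below. This is only an illustration: it uses the Python standard library rather than Biopython, so it is not necessarily the approach pictured above, and the function names `read_fasta` and `drop_gap_only_columns`, the `gap_chars` parameter, and the example sequences are all made up for the sketch. It assumes "-" is the only gap character.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_gap_only_columns(records, gap_chars="-"):
    """Return the records with every column that contains only gaps removed."""
    seqs = [seq for _, seq in records]
    if len(set(map(len, seqs))) > 1:
        raise ValueError("sequences differ in length; not an alignment")
    # zip(*seqs) walks the alignment column by column.
    keep = [i for i, column in enumerate(zip(*seqs))
            if any(c not in gap_chars for c in column)]
    return [(name, "".join(seq[i] for i in keep)) for name, seq in records]


# Hypothetical three-taxon alignment; columns 3 and 5 contain only gaps.
example = """>taxonA
MK-A-L
>taxonB
M--C-L
>taxonC
MR-D--
"""

cleaned = drop_gap_only_columns(read_fasta(example))
for name, seq in cleaned:
    print(f">{name}\n{seq}")
```

To use it in a pipeline, the file contents would be read into a string and passed to `read_fasta`, and the printed records redirected to a new FASTA file.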

nilshomer 02-23-2010 09:16 AM

Quote:

Originally Posted by kmkocot (Post 14072)
Hi all,

I am trying to find a script or a program that I can call in a pipeline that can remove gap-only sites in a multiple sequence alignment. Can you think of anything I can use? I'd also like to delete the columns at either end of each alignment where there are only 2 or fewer sequences with non-gap characters if you can think of anything that can do that.

Thanks!
Kevin

If your alignments are in the SAM format, you can remove all alignments with indels with the C program "dbamfilter -i". You could also easily parse the CIGAR field in the SAM format for any "I" or "D" operators.

What format are you working with?
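
The CIGAR check mentioned above could look something like the following sketch. It assumes tab-separated SAM text with the CIGAR string in the sixth column, as laid out in the SAM specification. The names `has_indel` and `filter_sam_lines` are invented for the example, and this is not a reimplementation of dbamfilter.

```python
import re

# One CIGAR operation: a length followed by an operator letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel(cigar):
    """Return True if the CIGAR string contains an insertion (I) or deletion (D)."""
    return any(op in "ID" for _, op in CIGAR_OP.findall(cigar))


def filter_sam_lines(lines):
    """Yield header lines and alignment records whose CIGAR has no I or D."""
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 6 (index 5) holds the CIGAR string; "*" means unavailable.
        if len(fields) > 5 and not has_indel(fields[5]):
            yield line
```

A record with an unavailable CIGAR ("*") is kept here, since nothing indicates an indel; whether that is the right choice depends on the pipeline.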

kmkocot 02-23-2010 09:52 AM

Hi maubp,

The alignments vary but all have fewer than 30 taxa and fewer than 1000 amino acids.

Thanks,
Kevin
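
With alignments this small, the end-trimming step from the first post could also be done entirely in memory. Below is a rough sketch under a few assumptions: "-" is the only gap character, a column counts as sparse when 2 or fewer sequences have a non-gap character in it, and only the leading and trailing runs of sparse columns are removed (sparse columns in the interior are left alone). The function name `trim_sparse_ends` and its parameters are hypothetical.

```python
def trim_sparse_ends(seqs, max_sparse=2, gap_chars="-"):
    """Trim leading and trailing columns with max_sparse or fewer non-gap characters."""
    if not seqs:
        return []
    if len(set(map(len, seqs))) > 1:
        raise ValueError("sequences differ in length; not an alignment")
    # Number of non-gap characters in each column.
    counts = [sum(c not in gap_chars for c in column) for column in zip(*seqs)]
    # Columns that are well enough populated to keep as alignment boundaries.
    dense = [i for i, n in enumerate(counts) if n > max_sparse]
    if not dense:
        # Every column is sparse, so nothing survives the trim.
        return ["" for _ in seqs]
    start, end = dense[0], dense[-1] + 1
    return [s[start:end] for s in seqs]


# Hypothetical four-taxon alignment with ragged ends.
aligned = [
    "--MKAL-",
    "-RMKAL-",
    "A-MK-LG",
    "---KALG",
]
print(trim_sparse_ends(aligned))
```

In this example the first two columns and the last column each have at most two non-gap characters, so they are trimmed and the middle four columns remain.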


All times are GMT -8.
