I need to set up large-scale tests of positive selection on many predicted transcripts from transcriptome assemblies. Does anyone have specific suggestions on preferred approaches or tools for generating many multiple sequence alignments (MSAs) and running tests for positive selection on them (e.g. with PAML)? Basically, I would like to take an input sequence, generate an MSA for it with genes from a selected group of species, clean up and filter that MSA to avoid downstream difficulties (e.g. trimming alignment ends or dealing with gaps) and then, given a tree for these species, test the alignment for positive selection (presumably using PAML). I have found a couple of automated approaches to MSA editing/optimization (probably the part I'm most concerned about getting right when automating), and I have also looked into programming it all myself in BioPython. I would be curious to hear about any experiences or suggestions anyone has with this issue, and whether you have found tools that you like for this type of application.
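To make the clean-up step concrete, here is a rough sketch of the kind of gap filtering I have in mind, in plain Python. It assumes the alignment is already codon-aligned (in frame, length a multiple of 3) and held as a dict of name to sequence. The function name `trim_codon_columns` and the `max_gap_fraction` threshold are hypothetical placeholders of mine, not part of PAML or BioPython.

```python
def trim_codon_columns(alignment, max_gap_fraction=0.2):
    """Drop codon columns where too many sequences contain a gap.

    alignment: dict mapping sequence name -> aligned nucleotide string.
    max_gap_fraction: hypothetical threshold; a codon column is kept only
    if the fraction of sequences with a gap in it is <= this value.
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}

    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    if length % 3 != 0:
        raise ValueError("alignment length must be a multiple of 3")

    # Collect the start positions of the codon columns worth keeping.
    keep = []
    for start in range(0, length, 3):
        codons = [s[start:start + 3] for s in seqs]
        gapped = sum(1 for codon in codons if "-" in codon)
        if gapped / len(seqs) <= max_gap_fraction:
            keep.append(start)

    # Rebuild each sequence from the retained codon columns only.
    return {
        name: "".join(seq[i:i + 3] for i in keep)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    example = {
        "sp1": "ATG---CCCGGG",
        "sp2": "ATGAAACCCGGG",
        "sp3": "ATG---CCC---",
    }
    for name, seq in trim_codon_columns(example).items():
        print(name, seq)
```

Filtering whole codons rather than single columns keeps the reading frame intact, which matters for codon-model tests. What I'm unsure about is whether a simple threshold like this is good enough, or whether a dedicated trimming tool is the safer choice.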
Thanks in advance for any thoughts or suggestions.