08-07-2013, 09:01 AM   #1
Senior Member
Location: Ottawa

Join Date: Apr 2011
Posts: 130
Giant alignment, high identity, which model for phylogeny?


My goal is a phylogeny of multiple isolates that shows which isolates are most closely related to each other.

I have an organism for which I did population genomics on isolates from a few distant geographic locations. The genome size is about 7-10 Mb.

I did de novo assemblies using MIRA for all of my isolates. I picked the best assembly, concatenated all of its contigs, and mapped the reads of each of the other isolates onto it to generate a new consensus for each of them.

Because the species is heterozygous, I picked a cutoff value of 85% when calling bases for the consensus, so that heterozygous loci should be called as ambiguities. I then took the consensus sequences of all isolates and aligned them using MAUVE. I trimmed out all sites that had ambiguities, thus removing heterozygous sites.
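To make the trimming step concrete, here is a minimal sketch of how ambiguous columns could be dropped from an alignment. It is not what I actually ran; the function name `drop_ambiguous_columns` and the dict-of-strings input are hypothetical, and it assumes that "ambiguity" means any IUPAC ambiguity code (gap characters are left alone).

```python
# Sketch only: remove every alignment column that contains an IUPAC
# ambiguity code in at least one isolate. Names here are hypothetical.
IUPAC_AMBIGUOUS = set("RYSWKMBDHVN")


def drop_ambiguous_columns(alignment):
    """alignment: dict mapping isolate name -> aligned sequence (equal lengths).

    Returns a new dict with the same keys, upper-cased, keeping only the
    columns in which no isolate has an ambiguity code.
    """
    names = list(alignment)
    seqs = [alignment[name].upper() for name in names]
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("all sequences must have the same aligned length")
    # zip(*seqs) walks the alignment column by column.
    kept = [col for col in zip(*seqs) if not IUPAC_AMBIGUOUS.intersection(col)]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


if __name__ == "__main__":
    toy = {"iso1": "ACGTRA", "iso2": "ACGTAA", "iso3": "ANGTAC"}
    print(drop_ambiguous_columns(toy))
```

Note that a column is dropped as soon as a single isolate is ambiguous there, so the more isolates are included, the more sites are lost.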

I am left with a very long alignment, still about 7-10 Mb, in which only a few thousand sites show any variability, spaced fairly evenly along its length.
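For reference, this is roughly how the variable sites could be located and counted. Again a sketch under assumptions: `variable_site_positions` is a hypothetical helper, the alignment is a dict of equal-length strings, and a column that differs only by gap characters is not counted as variable.

```python
# Sketch only: list the 0-based columns where the isolates disagree.
def variable_site_positions(alignment):
    """alignment: dict mapping isolate name -> aligned sequence (equal lengths).

    A column counts as variable if, ignoring '-' gaps, it contains more
    than one distinct character.
    """
    seqs = [s.upper() for s in alignment.values()]
    return [
        i
        for i, col in enumerate(zip(*seqs))
        if len(set(col) - {"-"}) > 1
    ]


if __name__ == "__main__":
    toy = {"iso1": "AGTA", "iso2": "AGTA", "iso3": "AGTC"}
    positions = variable_site_positions(toy)
    print(len(positions), "variable site(s) at", positions)
```

The gaps between consecutive positions in the returned list give a quick check of how evenly the variable sites are spaced.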

For the phylogeny, I picked a simple F model with 100 bootstrap replicates (BS), and estimated the proportion of invariable sites (I) and the gamma shape parameter (G), all in PhyML.

Any thoughts on this? I would really appreciate some advice: what might I have omitted? Is PhyML the best tool for this kind of analysis, or should I try a Bayesian approach, and if so, MrBayes, PhyloBayes or even BEAGLE? Are there any alternatives to MAUVE?

Thank you for your help,
AdrianP