Hi everyone, I have a question about calculating ω ratios with codeml for a specific purpose.
I have constructed an ML tree (for a gene family), and I am only interested in the substitutions along the internal branches of the tree.
The user-specified branch model in codeml can estimate the substitution rates and ω ratios for the internal branches of the whole tree or of a specific clade. Here, however, I want to calculate the ω ratios of the internal branches connecting terminal leaves and/or other internal nodes among which nucleotide divergences are below a specific threshold, such as 5%, 10%, 15%, 25%... To cover several thresholds, it would be best to loop over them in a programming language.
That is to say, suppose a tree contains three major binary clades ((A:0.125, B:0.125), C:0.175), and every clade contains many short binary branches (mean length, 0.03; range, 0.01-0.05). So, within each clade, divergences range from 0.02 to 0.10; across the whole tree, divergences range from 0.02 to 0.35. When I focus on all the sub-clades containing sequences whose divergences are less than 0.05 (these sequences may come from A, B, and C, respectively), I can specify one foreground ω for all the internal branches of those sub-clades; and when I set the threshold to 0.35, the estimated ω represents all the internal branches of the whole tree. These ω values for the different thresholds are what I want to calculate, but in practice I can hardly find any rule for adding the foreground label #1 to the right branches of the tree file in batch.
My real tree is more complex; at the very least, it has many more branches. So I need some good ideas for solving this problem with Perl or other tools.
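As one possible starting point, here is a minimal sketch in Python (not Perl) of the batch labelling described above. It rests on several assumptions that are not part of the original question: "divergence" is taken to be the patristic distance, i.e. the sum of branch lengths between two leaves on the ML tree; a sub-clade qualifies when the largest leaf-to-leaf distance inside it is below the threshold; and `#1` is placed on every internal branch lying inside a qualifying sub-clade. The names `Node`, `parse_newick`, `annotate`, `to_codeml`, and `label_tree` are hypothetical helpers written for this illustration. Branch lengths are dropped from the output, on the assumption that codeml will re-estimate them.

```python
import re


class Node:
    def __init__(self, name="", length=0.0, children=None):
        self.name = name
        self.length = length          # length of the branch above this node
        self.children = children or []
        self.height = 0.0             # longest path down to a leaf
        self.diameter = 0.0           # largest leaf-to-leaf path in the subtree


def parse_newick(text):
    """Parse a plain Newick string (names and branch lengths only)."""
    tokens = re.findall(r"[(),;]|[^(),;\s]+", text)
    punctuation = {"(", ")", ",", ";"}
    pos = 0

    def read_node():
        nonlocal pos
        children = []
        if tokens[pos] == "(":
            pos += 1                  # skip "("
            children.append(read_node())
            while tokens[pos] == ",":
                pos += 1
                children.append(read_node())
            pos += 1                  # skip ")"
        label = ""
        if pos < len(tokens) and tokens[pos] not in punctuation:
            label = tokens[pos]
            pos += 1
        name, _, length = label.partition(":")
        return Node(name, float(length) if length else 0.0, children)

    return read_node()


def annotate(node):
    """Fill in height and diameter (maximum pairwise divergence) bottom-up."""
    if not node.children:
        return
    reach = []
    for child in node.children:
        annotate(child)
        reach.append(child.height + child.length)
        node.diameter = max(node.diameter, child.diameter)
    reach.sort(reverse=True)
    node.height = reach[0]
    if len(reach) > 1:
        node.diameter = max(node.diameter, reach[0] + reach[1])


def to_codeml(node, threshold, inside=False):
    """Write the topology, adding #1 to internal branches inside qualifying clades."""
    if not node.children:
        return node.name
    qualifies = node.diameter < threshold
    parts = [to_codeml(child, threshold, qualifies) for child in node.children]
    text = "(" + ", ".join(parts) + ")"
    # "inside" means the parent clade qualifies, so this branch lies within it.
    return text + " #1" if inside else text


def label_tree(newick, threshold):
    root = parse_newick(newick)
    annotate(root)
    return to_codeml(root, threshold) + ";"


# Toy tree invented for illustration; replace it with the real ML tree.
toy = "(((a:0.01,b:0.01):0.01,c:0.02):0.10,((d:0.02,e:0.02):0.05,f:0.03):0.10);"
for threshold in (0.05, 0.15, 0.35):
    print(threshold, label_tree(toy, threshold))
```

For the toy tree, a threshold of 0.05 labels only the branch above (a, b), while 0.35 exceeds the largest divergence in the tree and labels every internal branch. If the stem branch of each qualifying sub-clade should also be foreground, the test in `to_codeml` could use the node's own `qualifies` value instead of `inside`; which rule is appropriate depends on how the sub-clades are meant to be defined. Each labelled tree string can then be written to its own tree file for a separate codeml run.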
All comments are welcome! Thanks in advance.