07-16-2014, 12:54 AM #1 sazz Member   Location: Istanbul, Turkey Join Date: Oct 2012 Posts: 28 Variance - basic statistics

Hello all,

I'm sorry for a very naive and basic question, but I am trying to understand a couple of graphs (dispersion plots, M vs A plots, etc.) and I am a little confused about the term "variance".

According to the formula, variance is the average of the squared differences from the mean. So can I say that genes with high FPKM values tend to have "high variance" and are therefore more dispersed than lowly expressed genes? (I guess high variance at high FPKM is not a problem when you fit a negative binomial distribution to calculate the significance of differential expression.)

But this also sounds odd, because without thinking about the math, I am tempted to say that lowly expressed genes are generally not significant in differential expression analyses because of the "variability" between their FPKM values. I guess there is a misconception here on my part, as I am thinking of variability as a "percentage". Moreover, this variability defines the shape of the negative binomial distribution used for statistical testing, that is, whether it is more squeezed or more spread out, right? :/

Sorry for asking about basic statistics. I would appreciate it if someone could explain briefly. Thanks!
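
To make the two notions of variability in the question concrete, here is a minimal Python sketch. It computes the variance exactly as described above (the average of the squared differences from the mean) and compares it with the coefficient of variation (standard deviation divided by the mean), which is the "percentage" view. The replicate values are made up for illustration, and the `nb_variance` helper assumes the usual negative binomial parameterization for read counts, variance = mean + dispersion * mean^2; note that this model is normally applied to raw counts rather than to FPKM values.

```python
from statistics import mean, pvariance

def coefficient_of_variation(values):
    """Standard deviation relative to the mean (the 'percentage' view)."""
    return pvariance(values) ** 0.5 / mean(values)

def nb_variance(mu, dispersion):
    """Variance of a negative binomial with mean mu (assumed parameterization)."""
    return mu + dispersion * mu ** 2

def nb_cv(mu, dispersion):
    """Relative variability under the same assumed model."""
    return nb_variance(mu, dispersion) ** 0.5 / mu

# Hypothetical replicate values for a lowly and a highly expressed gene.
low_gene = [4, 9, 2, 7]
high_gene = [950, 1100, 1020, 890]

for name, reps in (("low", low_gene), ("high", high_gene)):
    # pvariance is the average squared difference from the mean.
    print(name, mean(reps), pvariance(reps), coefficient_of_variation(reps))

# Under the assumed model, absolute variance rises with the mean
# while relative variability falls.
for mu in (10, 1000):
    print(mu, nb_variance(mu, 0.1), nb_cv(mu, 0.1))
```

In this made-up example the highly expressed gene has the larger variance in absolute terms, while the lowly expressed gene has the larger variability relative to its mean. With real replicates one would usually divide by n - 1 (`statistics.variance`) rather than n.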
07-17-2014, 08:36 AM #2 csmatyi Member   Location: Nebraska Join Date: Oct 2011 Posts: 25

Hello everyone,

I have a statistics question. I have a data set: 1 1 1 0 1 1 0 1 0 0 0 0 0 0 1 0 0

I want to measure how well the 1's accumulate at the top of the list, that is, how well the 1's and 0's separate. The above list should get a high value compared to a random one: 1 0 1 0 1 0 1 0 ...

What kind of test do I need for this? Thanks!
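
One possible way to put a number on this kind of separation is sketched below. It counts, over all (1, 0) pairs, the fraction in which the 1 appears earlier in the list. This is the Mann-Whitney U statistic divided by the number of pairs (equivalently, an area under the ROC curve), so it is a rank-based measure rather than a test in itself. The function name `separation_score` is made up for this illustration.

```python
def separation_score(labels):
    """Fraction of (1, 0) pairs in which the 1 comes before the 0.

    1.0 means every 1 precedes every 0; 0.0 means the reverse.
    """
    ones = sum(labels)
    zeros = len(labels) - ones
    if ones == 0 or zeros == 0:
        raise ValueError("need at least one 1 and one 0")

    zeros_remaining = zeros  # zeros not yet passed while scanning top-down
    concordant = 0
    for label in labels:
        if label == 1:
            # Every zero still below this 1 forms a correctly ordered pair.
            concordant += zeros_remaining
        else:
            zeros_remaining -= 1
    return concordant / (ones * zeros)

data = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
alternating = [1, 0, 1, 0, 1, 0, 1, 0]

print(separation_score(data))
print(separation_score(alternating))
```

To attach a p-value to such a score, the Mann-Whitney U test (Wilcoxon rank-sum test) on the list positions of the 1's versus the 0's would be one option, for example via `scipy.stats.mannwhitneyu`.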
07-25-2014, 05:45 AM #3 TiborNagy Senior Member   Location: Budapest Join Date: Mar 2010 Posts: 329

Hi sazz! Variance in gene expression does not depend on FPKM. It is just a statistical measure computed from the replicates.

csmatyi: I think you need a chi-square test.